Google Drive Single Excel File Analysis
Single Excel file analysis workflow with Google Drive integration: target sheets are extracted in a loop, staged as a JSON results file, and then processed with Gemini through dynamically built Pydantic models to produce a consolidated analysis workbook.
Workflow Information
ID: excel_single_file_analysis_clean
Namespace: excel_analysis
Version: 2.0
Created: 2025-07-31
Updated: 2025-07-31
Tasks: 3
Inputs
| Name | Type | Required | Default |
|---|---|---|---|
| nango_connection_id | string | Required | 4274993f-c614-4efa-a01e-8d07422f4b09 |
| nango_key | string | Required | 8df3e2de-2307-48d3-94bd-ddd3fd6a62ec |
| file_id | string | Required | 1Glz-mTMDCuGzUGoTLWOrSyIMI1sJDYlA |
| analysis_depth | string | Optional | comprehensive |
| target_sheets | string | Optional | Volumes,Power & Fuel,Per tonne,P&L,Operating metrics,Cash Flow,Balance Sheet,Financial Ratios,Valuation |
| results_file_name | string | Optional | Excel_Analysis_Results |
| analysis_years | string | Optional | FY2024 |
| gemini_api_key | string | Required | AIzaSyB0_e6aU4gF-qRapMm3UYBSITpbd0ehsYk |
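At runtime the two Nango inputs are exchanged for a short-lived Google Drive access token before any file access happens. Below is a minimal sketch of that exchange; the auth host and `provider_config_key` are taken verbatim from the task scripts in the YAML source further down, while `get_drive_access_token` is an illustrative name rather than the helper defined there.

```python
import requests

def get_drive_access_token(connection_id: str, nango_key: str) -> str:
    """Exchange a Nango connection for a Google Drive access token.

    Mirrors the authenticate_google_drive() helper that each task script
    defines; raises if the response carries no credentials.
    """
    url = (
        f"https://auth-dev.assistents.ai/connection/{connection_id}"
        "?provider_config_key=google-drive-hq3h"
    )
    headers = {"Authorization": f"Bearer {nango_key}"}
    response = requests.get(url, headers=headers, timeout=15)
    response.raise_for_status()
    token = response.json().get("credentials", {}).get("access_token")
    if not token:
        raise RuntimeError("No access token received from Nango")
    return token
```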
Outputs
| Name | Type | Description |
|---|---|---|
| excel_created | boolean | Whether the Excel file was successfully created |
| final_excel_url | string | Direct link to the final Excel analysis file |
| sheets_available | integer | Number of sheets available for processing |
| sheets_processed | integer | Number of sheets successfully processed |
| processing_status | string | Overall processing status |
| total_extractions | integer | Total number of data extractions performed |
| final_excel_file_id | string | Google Drive file ID of the final Excel analysis |
| json_download_status | string | Status of the JSON file download |
Tasks
- analyze_excel_file (script): Complete Excel analysis workflow in a single readable script.
- download_json_results (script): Download and parse the JSON results file.
- create_dynamic_models_and_process (script): Create dynamic Pydantic models from JSON and process with Gemini for each analysis year (see the sketch after this list).
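The dynamic-model step is the least obvious of the three, so here is a condensed sketch of the technique task 3 uses: every label extracted from a sheet becomes a string field on a runtime-built Pydantic model, which the task then hands to Gemini as a structured-output schema. The field-name cleaning below is simplified from the full script, and `build_sheet_model` is an illustrative name, not the function in the YAML.

```python
from pydantic import Field, create_model

def build_sheet_model(sheet_name: str, labels: list[str]):
    """Turn extracted row labels into a Pydantic model for structured extraction.

    Simplified from create_dynamic_pydantic_model() in the YAML source; the
    name normalization there handles a few more punctuation cases.
    """
    fields = {}
    for label in labels:
        # Normalize e.g. "Net debt / EBITDA (x)" -> "net_debt_ebitda_x"
        name = label.lower().replace("%", "percent")
        name = "".join(c if c.isalnum() else "_" for c in name)
        while "__" in name:
            name = name.replace("__", "_")
        name = name.strip("_")
        if name and name[0].isdigit():
            name = f"field_{name}"
        if name:
            fields[name] = (str, Field(description=label))
    class_name = "".join(c for c in sheet_name.title() if c.isalnum()) + "Model"
    return create_model(class_name, **fields)

# model = build_sheet_model("Cash Flow", ["Net profit", "Capex", "Free cash flow"])
# Task 3 passes the resulting class to Gemini via
# GenerationConfig(response_mime_type="application/json", response_schema=model).
```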
YAML Source
id: excel_single_file_analysis_clean
name: Clean Excel File Analysis with Drive Upload
tasks:
- id: analyze_excel_file
name: Analyze Excel File and Upload Results
type: script
script: "import json\nimport requests\nimport pandas as pd\nimport io\nfrom datetime\
\ import datetime\nfrom typing import Dict, List, Any\n\n# =============================================================================\n\
# CONFIGURATION\n# =============================================================================\n\
def get_configuration():\n \"\"\"Get and validate input configuration.\"\"\"\
\n config = {\n 'nango_connection_id': \"${nango_connection_id}\",\n\
\ 'nango_key': \"${nango_key}\",\n 'file_id': \"${file_id}\".strip(),\n\
\ 'analysis_depth': \"${analysis_depth}\",\n 'target_sheets': [sheet.strip()\
\ for sheet in \"${target_sheets}\".split(',') if sheet.strip()],\n 'results_file_name':\
\ \"${results_file_name}\"\n }\n \n print(\"\U0001F4CB EXCEL FILE ANALYSIS\
\ CONFIGURATION\")\n print(\"=\" * 50)\n print(f\"File ID: {config['file_id']}\"\
)\n print(f\"Analysis Depth: {config['analysis_depth']}\")\n print(f\"Target\
\ Sheets: {len(config['target_sheets'])}\")\n for i, sheet in enumerate(config['target_sheets'],\
\ 1):\n print(f\" {i}. {sheet}\")\n print(f\"Results File Name: {config['results_file_name']}\"\
)\n print()\n \n return config\n\n# =============================================================================\n\
# GOOGLE DRIVE AUTHENTICATION\n# =============================================================================\n\
def authenticate_google_drive(nango_connection_id: str, nango_key: str) -> str:\n\
\ \"\"\"Authenticate with Google Drive and return access token.\"\"\"\n \
\ print(\"\U0001F510 Authenticating with Google Drive...\")\n \n auth_url\
\ = f\"https://auth-dev.assistents.ai/connection/{nango_connection_id}?provider_config_key=google-drive-hq3h\"\
\n headers = {\n 'Authorization': f'Bearer {nango_key}',\n 'Content-Type':\
\ 'application/json'\n }\n \n try:\n response = requests.get(auth_url,\
\ headers=headers, timeout=15)\n response.raise_for_status()\n \n\
\ auth_data = response.json()\n access_token = auth_data.get('credentials',\
\ {}).get('access_token')\n \n if not access_token:\n \
\ raise Exception(\"No access token received from authentication\")\n \
\ \n print(\"\u2705 Authentication successful\")\n return access_token\n\
\ \n except Exception as e:\n raise Exception(f\"Authentication\
\ failed: {str(e)}\")\n\n# =============================================================================\n\
# FILE DOWNLOAD AND INFO\n# =============================================================================\n\
def get_file_info(file_id: str, access_token: str) -> Dict[str, Any]:\n \"\"\
\"Get file information from Google Drive.\"\"\"\n print(\"\U0001F4C4 Getting\
\ file information...\")\n \n headers = {'Authorization': f'Bearer {access_token}'}\n\
\ url = f\"https://www.googleapis.com/drive/v3/files/{file_id}\"\n params\
\ = {'fields': 'id,name,mimeType,size,createdTime,modifiedTime'}\n \n try:\n\
\ response = requests.get(url, headers=headers, params=params, timeout=15)\n\
\ response.raise_for_status()\n \n file_info = response.json()\n\
\ print(f\"\u2705 File: {file_info['name']}\")\n print(f\" Type:\
\ {file_info['mimeType']}\")\n print(f\" Size: {file_info.get('size',\
\ 'N/A')} bytes\")\n \n return file_info\n \n except Exception\
\ as e:\n raise Exception(f\"Failed to get file info: {str(e)}\")\n\ndef\
\ download_excel_file(file_id: str, file_info: Dict[str, Any], access_token: str)\
\ -> bytes:\n \"\"\"Download Excel file content.\"\"\"\n print(\"\u2B07\uFE0F\
\ Downloading Excel file...\")\n \n headers = {'Authorization': f'Bearer\
\ {access_token}'}\n mime_type = file_info.get('mimeType')\n \n # Handle\
\ Google Sheets vs Excel files\n if mime_type == 'application/vnd.google-apps.spreadsheet':\n\
\ url = f\"https://www.googleapis.com/drive/v3/files/{file_id}/export\"\
\n params = {'mimeType': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'}\n\
\ else:\n url = f\"https://www.googleapis.com/drive/v3/files/{file_id}\"\
\n params = {'alt': 'media'}\n \n try:\n response = requests.get(url,\
\ headers=headers, params=params, timeout=60)\n response.raise_for_status()\n\
\ \n print(\"\u2705 File downloaded successfully\")\n return\
\ response.content\n \n except Exception as e:\n raise Exception(f\"\
Failed to download file: {str(e)}\")\n\n# =============================================================================\n\
# EXCEL ANALYSIS\n# =============================================================================\n\
def analyze_excel_structure(file_content: bytes, target_sheets: List[str]) ->\
\ Dict[str, Any]:\n \"\"\"Analyze Excel file structure and identify target\
\ sheets.\"\"\"\n print(\"\U0001F50D Analyzing Excel file structure...\")\n\
\ \n try:\n excel_file = pd.ExcelFile(io.BytesIO(file_content))\n\
\ available_sheets = excel_file.sheet_names\n \n print(f\"\
\u2705 Found {len(available_sheets)} sheets:\")\n for i, sheet in enumerate(available_sheets,\
\ 1):\n print(f\" {i}. {sheet}\")\n \n # Match target\
\ sheets with available sheets (case-insensitive)\n matched_sheets = []\n\
\ for available_sheet in available_sheets:\n for target_sheet\
\ in target_sheets:\n if available_sheet.lower().strip() == target_sheet.lower().strip():\n\
\ matched_sheets.append({\n 'sheet_name':\
\ available_sheet,\n 'target_name': target_sheet\n \
\ })\n break\n \n print(f\"\\\
n\U0001F3AF Target sheets found: {len(matched_sheets)}\")\n for sheet in\
\ matched_sheets:\n print(f\" \u2713 {sheet['sheet_name']} (matches\
\ '{sheet['target_name']}')\")\n \n return {\n 'excel_file':\
\ excel_file,\n 'available_sheets': available_sheets,\n \
\ 'matched_sheets': matched_sheets\n }\n \n except Exception\
\ as e:\n raise Exception(f\"Failed to analyze Excel structure: {str(e)}\"\
)\n\ndef extract_sheet_data(excel_file, sheet_name: str, analysis_depth: str)\
\ -> Dict[str, Any]:\n \"\"\"Extract data from a specific sheet based on analysis\
\ depth.\"\"\"\n print(f\"\U0001F4CA Analyzing sheet: {sheet_name}\")\n \
\ \n try:\n df = pd.read_excel(excel_file, sheet_name=sheet_name, header=None)\n\
\ \n if df.empty:\n print(f\" \u26A0\uFE0F Sheet '{sheet_name}'\
\ is empty\")\n return {\n 'sheet_name': sheet_name,\n\
\ 'status': 'empty',\n 'values_extracted': [],\n\
\ 'total_values': 0,\n 'dimensions': {'rows': 0,\
\ 'columns': 0}\n }\n \n print(f\" \U0001F4CF Dimensions:\
\ {df.shape[0]} rows \xD7 {df.shape[1]} columns\")\n \n # Extract\
\ data based on analysis depth\n extracted_values = []\n \n \
\ if analysis_depth == \"basic\":\n # Basic: Only first column,\
\ first 10 rows\n if len(df.columns) > 0:\n col_values\
\ = df.iloc[:10, 0].dropna().astype(str).str.strip()\n extracted_values.extend([v\
\ for v in col_values if v and not v.startswith('Unnamed:')])\n \
\ \n elif analysis_depth == \"comprehensive\":\n # Comprehensive:\
\ First 2 columns, all rows\n for col_idx in range(min(2, len(df.columns))):\n\
\ col_values = df.iloc[:, col_idx].dropna().astype(str).str.strip()\n\
\ extracted_values.extend([v for v in col_values if v and not v.startswith('Unnamed:')])\n\
\ \n else: # detailed\n # Detailed: First 3\
\ columns, all rows, plus data type info\n for col_idx in range(min(3,\
\ len(df.columns))):\n col_values = df.iloc[:, col_idx].dropna().astype(str).str.strip()\n\
\ extracted_values.extend([v for v in col_values if v and not v.startswith('Unnamed:')])\n\
\ \n # Remove duplicates while preserving order\n unique_values\
\ = []\n seen = set()\n for value in extracted_values:\n \
\ if value not in seen:\n seen.add(value)\n \
\ unique_values.append(value)\n \n print(f\" \u2705 Extracted {len(unique_values)}\
\ unique values\")\n \n result = {\n 'sheet_name': sheet_name,\n\
\ 'status': 'success',\n 'values_extracted': unique_values,\n\
\ 'total_values': len(unique_values),\n 'dimensions': {'rows':\
\ df.shape[0], 'columns': df.shape[1]},\n 'analysis_depth': analysis_depth\n\
\ }\n \n if analysis_depth == \"detailed\":\n \
\ # Add data type analysis for detailed mode\n data_types = {}\n \
\ for col_idx in range(min(3, len(df.columns))):\n col_data\
\ = df.iloc[:, col_idx].dropna()\n if len(col_data) > 0:\n \
\ data_types[f\"column_{col_idx}\"] = {\n \
\ \"data_type\": str(col_data.dtype),\n \"sample_values\"\
: col_data.head(3).tolist()\n }\n result['data_type_analysis']\
\ = data_types\n \n return result\n \n except Exception\
\ as e:\n print(f\" \u274C Error analyzing sheet '{sheet_name}': {str(e)}\"\
)\n return {\n 'sheet_name': sheet_name,\n 'status':\
\ 'error',\n 'error': str(e),\n 'total_values': 0\n \
\ }\n\ndef process_all_sheets(excel_analysis: Dict[str, Any], analysis_depth:\
\ str) -> Dict[str, Any]:\n \"\"\"Process all matched sheets and compile results.\"\
\"\"\n print(\"\\n\U0001F4DA Processing all target sheets...\")\n print(\"\
=\" * 40)\n \n excel_file = excel_analysis['excel_file']\n matched_sheets\
\ = excel_analysis['matched_sheets']\n \n if not matched_sheets:\n \
\ print(\"\u26A0\uFE0F No target sheets found to process\")\n return\
\ {\n 'processed_sheets': [],\n 'total_values_extracted':\
\ 0,\n 'processing_errors': [],\n 'success_count': 0\n \
\ }\n \n processed_sheets = []\n total_values = 0\n processing_errors\
\ = []\n success_count = 0\n \n for i, sheet_info in enumerate(matched_sheets,\
\ 1):\n print(f\"\\n\U0001F504 Processing sheet {i}/{len(matched_sheets)}:\
\ {sheet_info['sheet_name']}\")\n \n sheet_result = extract_sheet_data(excel_file,\
\ sheet_info['sheet_name'], analysis_depth)\n processed_sheets.append(sheet_result)\n\
\ \n if sheet_result['status'] == 'success':\n total_values\
\ += sheet_result['total_values']\n success_count += 1\n elif\
\ sheet_result['status'] == 'error':\n processing_errors.append(sheet_result)\n\
\ \n print(f\"\\n\u2705 Processing complete!\")\n print(f\" Sheets\
\ processed: {len(processed_sheets)}\")\n print(f\" Successful: {success_count}\"\
)\n print(f\" Errors: {len(processing_errors)}\")\n print(f\" Total\
\ values extracted: {total_values}\")\n \n return {\n 'processed_sheets':\
\ processed_sheets,\n 'total_values_extracted': total_values,\n \
\ 'processing_errors': processing_errors,\n 'success_count': success_count\n\
\ }\n\n# =============================================================================\n\
# RESULTS UPLOAD\n# =============================================================================\n\
def upload_results_to_drive(results: Dict[str, Any], file_id: str, file_info:\
\ Dict[str, Any], \n config: Dict[str, Any], access_token:\
\ str) -> Dict[str, Any]:\n \"\"\"Upload comprehensive results to Google Drive\
\ as JSON file.\"\"\"\n print(\"\\n\U0001F4E4 Uploading results to Google Drive...\"\
)\n print(\"=\" * 40)\n \n # Create comprehensive results structure\n\
\ comprehensive_results = {\n \"metadata\": {\n \"workflow_id\"\
: \"excel_single_file_analysis_clean\",\n \"execution_timestamp\":\
\ datetime.now().isoformat(),\n \"analysis_depth\": config['analysis_depth'],\n\
\ \"source_file_id\": file_id,\n \"target_sheets_requested\"\
: config['target_sheets']\n },\n \"execution_summary\": {\n \
\ \"file_id\": file_id,\n \"file_name\": file_info.get('name',\
\ 'Unknown'),\n \"file_size\": file_info.get('size', 'N/A'),\n \
\ \"sheets_processed\": len(results['processed_sheets']),\n \
\ \"successful_sheets\": results['success_count'],\n \"failed_sheets\"\
: len(results['processing_errors']),\n \"total_values_extracted\":\
\ results['total_values_extracted'],\n \"processing_timestamp\": datetime.now().isoformat()\n\
\ },\n \"detailed_results\": {\n \"sheet_analysis\":\
\ results['processed_sheets'],\n \"processing_errors\": results['processing_errors']\n\
\ },\n \"source_file_metadata\": file_info\n }\n \n # Convert\
\ to JSON\n json_content = json.dumps(comprehensive_results, indent=2, ensure_ascii=False)\n\
\ print(f\"\U0001F4C4 Generated results JSON ({len(json_content)} characters)\"\
)\n \n # Create filename with timestamp\n timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')\n\
\ filename = f\"{config['results_file_name']}_{timestamp}.json\"\n \n \
\ try:\n # Upload to Google Drive using multipart upload\n boundary\
\ = '-------314159265358979323846'\n delimiter = f'\\r\\n--{boundary}\\\
r\\n'\n close_delim = f'\\r\\n--{boundary}--'\n \n metadata\
\ = {\n 'name': filename,\n 'parents': [] # Upload to root\
\ folder\n }\n \n body = delimiter + 'Content-Type: application/json\\\
r\\n\\r\\n'\n body += json.dumps(metadata)\n body += delimiter +\
\ 'Content-Type: application/json\\r\\n\\r\\n'\n body += json_content\n\
\ body += close_delim\n \n headers = {\n 'Authorization':\
\ f'Bearer {access_token}',\n 'Content-Type': f'multipart/related;\
\ boundary=\"{boundary}\"'\n }\n \n upload_url = 'https://www.googleapis.com/upload/drive/v3/files?uploadType=multipart'\n\
\ response = requests.post(upload_url, headers=headers, data=body, timeout=60)\n\
\ response.raise_for_status()\n \n upload_result = response.json()\n\
\ uploaded_file_id = upload_result['id']\n file_url = f\"https://drive.google.com/file/d/{uploaded_file_id}/view\"\
\n \n print(\"\u2705 Results uploaded successfully!\")\n \
\ print(f\" File ID: {uploaded_file_id}\")\n print(f\" Filename: {filename}\"\
)\n print(f\" URL: {file_url}\")\n \n return {\n \
\ 'upload_success': True,\n 'uploaded_file_id': uploaded_file_id,\n\
\ 'filename': filename,\n 'file_url': file_url,\n \
\ 'file_size_bytes': len(json_content)\n }\n \n except Exception\
\ as e:\n print(f\"\u274C Failed to upload results: {str(e)}\")\n \
\ return {\n 'upload_success': False,\n 'error': str(e)\n\
\ }\n\n# =============================================================================\n\
# MAIN EXECUTION\n# =============================================================================\n\
def main():\n \"\"\"Main execution function.\"\"\"\n try:\n # Get\
\ configuration\n config = get_configuration()\n \n # Validate\
\ file ID\n if not config['file_id'] or len(config['file_id']) < 10:\n\
\ raise Exception(\"Invalid file ID provided\")\n \n \
\ # Authenticate with Google Drive\n access_token = authenticate_google_drive(config['nango_connection_id'],\
\ config['nango_key'])\n \n # Get file information\n file_info\
\ = get_file_info(config['file_id'], access_token)\n \n # Download\
\ Excel file\n file_content = download_excel_file(config['file_id'], file_info,\
\ access_token)\n \n # Analyze Excel structure\n excel_analysis\
\ = analyze_excel_structure(file_content, config['target_sheets'])\n \n\
\ # Process all sheets\n processing_results = process_all_sheets(excel_analysis,\
\ config['analysis_depth'])\n \n # Upload results to Google Drive\n\
\ upload_results = upload_results_to_drive(\n processing_results,\
\ config['file_id'], file_info, config, access_token\n )\n \n \
\ # Compile final output\n final_results = {\n 'status':\
\ 'success',\n 'file_id': config['file_id'],\n 'file_name':\
\ file_info.get('name', 'Unknown'),\n 'analysis_depth': config['analysis_depth'],\n\
\ 'sheets_processed': len(processing_results['processed_sheets']),\n\
\ 'successful_sheets': processing_results['success_count'],\n \
\ 'total_values_extracted': processing_results['total_values_extracted'],\n\
\ 'upload_success': upload_results['upload_success'],\n \
\ 'uploaded_file_id': upload_results.get('uploaded_file_id'),\n 'results_file_url':\
\ upload_results.get('file_url'),\n 'results_filename': upload_results.get('filename'),\n\
\ 'processing_timestamp': datetime.now().isoformat()\n }\n \
\ \n if not upload_results['upload_success']:\n final_results['upload_error']\
\ = upload_results.get('error')\n \n print(\"\\n\" + \"=\" * 70)\n\
\ print(\"\U0001F389 EXCEL ANALYSIS COMPLETE!\")\n print(\"=\" *\
\ 70)\n print(f\"\u2705 File: {final_results['file_name']}\")\n \
\ print(f\"\U0001F4CA Sheets processed: {final_results['sheets_processed']}\"\
)\n print(f\"\U0001F4C8 Values extracted: {final_results['total_values_extracted']}\"\
)\n \n if final_results['upload_success']:\n print(f\"\
\U0001F4E4 Results uploaded: {final_results['results_filename']}\")\n \
\ print(f\"\U0001F517 Access at: {final_results['results_file_url']}\")\n \
\ else:\n print(f\"\u26A0\uFE0F Upload failed: {final_results.get('upload_error',\
\ 'Unknown error')}\")\n \n return final_results\n \n \
\ except Exception as e:\n error_result = {\n 'status': 'error',\n\
\ 'error': str(e),\n 'file_id': config.get('file_id', 'unknown')\
\ if 'config' in locals() else 'unknown',\n 'sheets_processed': 0,\n\
\ 'total_values_extracted': 0,\n 'upload_success': False,\n\
\ 'processing_timestamp': datetime.now().isoformat()\n }\n \
\ \n print(f\"\\n\u274C ANALYSIS FAILED: {str(e)}\")\n return\
\ error_result\n\n# Execute main function and output results\nresults = main()\n\
print(f\"\\n__OUTPUTS__ {json.dumps(results)}\")\n"
description: Complete Excel analysis workflow in a single readable script
requirements:
- pandas
- openpyxl
- requests
timeout_seconds: 300
- id: download_json_results
name: Download JSON Results from Drive
type: script
script: "import json\nimport requests\nfrom typing import Dict, List, Any\n\ndef\
\ authenticate_google_drive(nango_connection_id: str, nango_key: str) -> str:\n\
\ \"\"\"Authenticate with Google Drive and return access token.\"\"\"\n \
\ print(\"\U0001F510 Authenticating with Google Drive...\")\n \n auth_url\
\ = f\"https://auth-dev.assistents.ai/connection/{nango_connection_id}?provider_config_key=google-drive-hq3h\"\
\n headers = {\n 'Authorization': f'Bearer {nango_key}',\n 'Content-Type':\
\ 'application/json'\n }\n \n try:\n response = requests.get(auth_url,\
\ headers=headers, timeout=15)\n response.raise_for_status()\n \n\
\ auth_data = response.json()\n access_token = auth_data.get('credentials',\
\ {}).get('access_token')\n \n if not access_token:\n \
\ raise Exception(\"No access token received from authentication\")\n \
\ \n print(\"\u2705 Authentication successful\")\n return access_token\n\
\ \n except Exception as e:\n raise Exception(f\"Authentication\
\ failed: {str(e)}\")\n\ndef download_json_file(file_id: str, access_token: str)\
\ -> Dict[str, Any]:\n \"\"\"Download JSON file from Google Drive.\"\"\"\n\
\ print(\"\U0001F4E5 Downloading JSON results file...\")\n \n headers\
\ = {'Authorization': f'Bearer {access_token}'}\n url = f\"https://www.googleapis.com/drive/v3/files/{file_id}\"\
\n params = {'alt': 'media'}\n \n try:\n response = requests.get(url,\
\ headers=headers, params=params, timeout=60)\n response.raise_for_status()\n\
\ \n json_data = response.json()\n print(\"\u2705 JSON file\
\ downloaded and parsed successfully\")\n return json_data\n \n\
\ except Exception as e:\n raise Exception(f\"Failed to download JSON\
\ file: {str(e)}\")\n\ndef main():\n try:\n # Get configuration\n \
\ nango_connection_id = \"${nango_connection_id}\"\n nango_key = \"\
${nango_key}\"\n results_file_id = \"${analyze_excel_file.uploaded_file_id}\"\
\n \n print(\"\U0001F4CB DOWNLOADING JSON RESULTS\")\n print(\"\
=\" * 50)\n print(f\"File ID: {results_file_id}\")\n \n #\
\ Authenticate and download\n access_token = authenticate_google_drive(nango_connection_id,\
\ nango_key)\n json_data = download_json_file(results_file_id, access_token)\n\
\ \n # Extract key information\n execution_summary = json_data.get('execution_summary',\
\ {})\n detailed_results = json_data.get('detailed_results', {})\n \
\ sheet_analysis = detailed_results.get('sheet_analysis', [])\n \n \
\ print(f\"\U0001F4CA Found {len(sheet_analysis)} sheets to process\")\n\
\ \n result = {\n 'status': 'success',\n 'json_data':\
\ json_data,\n 'sheet_count': len(sheet_analysis),\n 'sheets_available':\
\ [sheet['sheet_name'] for sheet in sheet_analysis if sheet['status'] == 'success'],\n\
\ 'total_values': sum(sheet['total_values'] for sheet in sheet_analysis\
\ if sheet['status'] == 'success'),\n 'source_file_name': execution_summary.get('file_name',\
\ 'Unknown')\n }\n \n print(f\"\u2705 Ready to process {result['sheet_count']}\
\ sheets\")\n print(f\"\U0001F4DD Total values extracted: {result['total_values']}\"\
)\n \n return result\n \n except Exception as e:\n \
\ return {\n 'status': 'error',\n 'error': str(e),\n\
\ 'sheet_count': 0,\n 'total_values': 0\n }\n\n#\
\ Execute and output results\nresults = main()\nprint(f\"\\n__OUTPUTS__ {json.dumps(results)}\"\
)\n"
depends_on:
- analyze_excel_file
description: Download and parse the JSON results file
requirements:
- requests
timeout_seconds: 120
- id: create_dynamic_models_and_process
name: Create Dynamic Pydantic Models and Process with Gemini
type: script
script: "import json\nimport time\nimport io\nfrom datetime import datetime\nfrom\
\ typing import Dict, List, Any, Optional\nfrom pydantic import BaseModel, Field,\
\ create_model\nimport google.generativeai as genai\nimport requests\nimport pandas\
\ as pd\nfrom openpyxl import Workbook\nfrom openpyxl.styles import PatternFill,\
\ Font, Alignment, Border, Side\nfrom openpyxl.utils import get_column_letter\n\
\ndef create_dynamic_pydantic_model(sheet_name: str, values_extracted: List[str]):\n\
\ \"\"\"Create a dynamic Pydantic model based on sheet name and extracted values.\"\
\"\"\n print(f\"\U0001F4DD Creating model for sheet: {sheet_name}\")\n \n\
\ # Clean sheet name for class name\n class_name = sheet_name.replace('\
\ ', '').replace('&', 'And').replace('-', '').replace('/', '')\n class_name\
\ = f\"{class_name}Model\"\n \n # Create field definitions\n field_definitions\
\ = {}\n for value in values_extracted:\n # Clean field name\n \
\ field_name = value.lower().replace(' ', '_').replace('&', 'and').replace('-',\
\ '_')\n field_name = field_name.replace('/', '_').replace('(', '').replace(')',\
\ '')\n field_name = field_name.replace('%', 'percent').replace('.', '').replace(',',\
\ '')\n field_name = ''.join(c for c in field_name if c.isalnum() or c\
\ == '_')\n \n # Ensure field name starts with letter\n if\
\ field_name and field_name[0].isdigit():\n field_name = f\"field_{field_name}\"\
\n \n if field_name: # Only add if we have a valid field name\n\
\ field_definitions[field_name] = (str, Field(description=value))\n\
\ \n # Create the dynamic model\n model = create_model(class_name, **field_definitions)\n\
\ \n print(f\"\u2705 Created model '{class_name}' with {len(field_definitions)}\
\ fields\")\n return model, class_name\n\ndef setup_gemini(api_key: str):\n\
\ \"\"\"Setup Gemini API client.\"\"\"\n genai.configure(api_key=api_key)\n\
\ return genai.GenerativeModel('gemini-1.5-flash')\n\ndef process_sheet_with_gemini(model,\
\ sheet_data: Dict, dynamic_model, model_name: str, analysis_year: str) -> Optional[Dict]:\n\
\ \"\"\"Process sheet data with Gemini using dynamic model.\"\"\"\n try:\n\
\ sheet_name = sheet_data['sheet_name']\n values_extracted = sheet_data['values_extracted']\n\
\ \n print(f\"\U0001F916 Processing {sheet_name} for {analysis_year}\
\ with Gemini...\")\n \n # Create prompt for financial data extraction\n\
\ prompt = f\"\"\"\n You are a financial data extraction specialist.\
\ I need you to extract financial data for {analysis_year} from the following\
\ sheet: {sheet_name}.\n\n Available data points from the sheet:\n \
\ {json.dumps(values_extracted, indent=2)}\n\n Please extract the financial\
\ values for {analysis_year} and structure them according to the provided schema.\n\
\ \n IMPORTANT INSTRUCTIONS:\n 1. Focus specifically on {analysis_year}\
\ data\n 2. All monetary figures should be in the original units found\
\ in the data\n 3. If a value is not found or not applicable, use an empty\
\ string \"\"\n 4. For ratios and percentages, include the unit (e.g.,\
\ \"2.5x\", \"15%\")\n 5. Be accurate and thorough in extraction\n \
\ 6. Convert all values to strings for consistency\n\n Extract the data\
\ for sheet: {sheet_name}\n \"\"\"\n\n # Generate content with Gemini\n\
\ response = model.generate_content(\n prompt,\n \
\ generation_config=genai.types.GenerationConfig(\n response_mime_type=\"\
application/json\",\n response_schema=dynamic_model\n \
\ )\n )\n \n result = json.loads(response.text)\n \
\ print(f\"\u2705 Successfully processed {sheet_name} for {analysis_year}\"\
)\n return result\n \n except Exception as e:\n print(f\"\
\u274C Error processing {sheet_name} for {analysis_year}: {str(e)}\")\n \
\ return None\n\ndef create_comprehensive_excel(processed_data: Dict[str, Dict[str,\
\ Dict]], source_file_name: str) -> bytes:\n \"\"\"Create comprehensive Excel\
\ workbook with processed data.\"\"\"\n print(\"\U0001F4CA Creating comprehensive\
\ Excel workbook...\")\n \n wb = Workbook()\n wb.remove(wb.active) #\
\ Remove default sheet\n \n # Define styles\n header_fill = PatternFill(start_color=\"\
366092\", end_color=\"366092\", fill_type=\"solid\")\n header_font = Font(bold=True,\
\ color=\"FFFFFF\", size=11)\n year_fill = PatternFill(start_color=\"DCE6F1\"\
, end_color=\"DCE6F1\", fill_type=\"solid\")\n year_font = Font(bold=True,\
\ size=10)\n data_font = Font(size=10)\n center_align = Alignment(horizontal='center',\
\ vertical='center')\n left_align = Alignment(horizontal='left', vertical='center')\n\
\ right_align = Alignment(horizontal='right', vertical='center')\n \n \
\ thin_border = Border(\n left=Side(style='thin'), right=Side(style='thin'),\n\
\ top=Side(style='thin'), bottom=Side(style='thin')\n )\n \n #\
\ Process each sheet\n for sheet_name, year_data in processed_data.items():\n\
\ if not year_data:\n continue\n \n ws = wb.create_sheet(title=sheet_name[:31])\n\
\ \n # Get all unique metrics across all years\n all_metrics\
\ = set()\n for year, data in year_data.items():\n if data:\n\
\ all_metrics.update(data.keys())\n \n all_metrics\
\ = sorted(list(all_metrics))\n analysis_years = sorted(year_data.keys())\n\
\ \n # Create header row\n headers = ['Metric'] + analysis_years\n\
\ ws.append(headers)\n \n # Style header row\n for\
\ col_idx, header in enumerate(headers, 1):\n cell = ws.cell(row=1,\
\ column=col_idx)\n cell.fill = header_fill\n cell.font\
\ = header_font\n cell.alignment = center_align\n cell.border\
\ = thin_border\n \n # Add data rows\n for metric in all_metrics:\n\
\ row_data = [metric.replace('_', ' ').title()]\n \n \
\ for year in analysis_years:\n value = year_data.get(year,\
\ {}).get(metric, \"\")\n row_data.append(value)\n \n\
\ ws.append(row_data)\n \n # Style the row\n\
\ current_row = ws.max_row\n for col_idx in range(1, len(headers)\
\ + 1):\n cell = ws.cell(row=current_row, column=col_idx)\n \
\ cell.border = thin_border\n \n if\
\ col_idx == 1: # Metric name column\n cell.font = year_font\n\
\ cell.alignment = left_align\n else: # Data\
\ columns\n cell.font = data_font\n cell.alignment\
\ = right_align\n \n # Adjust column widths\n ws.column_dimensions['A'].width\
\ = 40\n for col_idx in range(2, len(headers) + 1):\n ws.column_dimensions[get_column_letter(col_idx)].width\
\ = 15\n \n # Freeze panes\n ws.freeze_panes = 'B2'\n \
\ \n print(f\"\u2705 Created sheet: {sheet_name} with {len(all_metrics)}\
\ metrics\")\n \n # Save to bytes\n excel_buffer = io.BytesIO()\n \
\ wb.save(excel_buffer)\n excel_buffer.seek(0)\n \n print(f\"\U0001F4C1\
\ Excel workbook created with {len(processed_data)} sheets\")\n return excel_buffer.getvalue()\n\
\ndef upload_excel_to_drive(excel_content: bytes, filename: str, access_token:\
\ str) -> Dict[str, Any]:\n \"\"\"Upload Excel file to Google Drive.\"\"\"\n\
\ print(\"\U0001F4E4 Uploading Excel file to Google Drive...\")\n \n \
\ try:\n boundary = '-------314159265358979323846'\n delimiter =\
\ f'\\r\\n--{boundary}\\r\\n'\n close_delim = f'\\r\\n--{boundary}--'\n\
\ \n metadata = {\n 'name': filename,\n 'parents':\
\ [] # Upload to root folder\n }\n \n body = delimiter +\
\ 'Content-Type: application/json\\r\\n\\r\\n'\n body += json.dumps(metadata)\n\
\ body += delimiter + 'Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet\\\
r\\n\\r\\n'\n body += excel_content.decode('latin-1')\n body +=\
\ close_delim\n \n headers = {\n 'Authorization': f'Bearer\
\ {access_token}',\n 'Content-Type': f'multipart/related; boundary=\"\
{boundary}\"'\n }\n \n upload_url = 'https://www.googleapis.com/upload/drive/v3/files?uploadType=multipart'\n\
\ response = requests.post(upload_url, headers=headers, data=body.encode('latin-1'),\
\ timeout=120)\n response.raise_for_status()\n \n upload_result\
\ = response.json()\n file_url = f\"https://drive.google.com/file/d/{upload_result['id']}/view\"\
\n \n print(\"\u2705 Excel file uploaded successfully!\")\n \
\ return {\n 'success': True,\n 'file_id': upload_result['id'],\n\
\ 'file_url': file_url\n }\n \n except Exception as\
\ e:\n print(f\"\u274C Failed to upload Excel file: {str(e)}\")\n \
\ return {\n 'success': False,\n 'error': str(e)\n \
\ }\n\ndef authenticate_google_drive(nango_connection_id: str, nango_key:\
\ str) -> str:\n \"\"\"Authenticate with Google Drive.\"\"\"\n auth_url\
\ = f\"https://auth-dev.assistents.ai/connection/{nango_connection_id}?provider_config_key=google-drive-hq3h\"\
\n headers = {\n 'Authorization': f'Bearer {nango_key}',\n 'Content-Type':\
\ 'application/json'\n }\n \n response = requests.get(auth_url, headers=headers,\
\ timeout=15)\n response.raise_for_status()\n \n auth_data = response.json()\n\
\ return auth_data['credentials']['access_token']\n\ndef main():\n try:\n\
\ # Get configuration\n gemini_api_key = \"${gemini_api_key}\"\n\
\ analysis_years = [year.strip() for year in \"${analysis_years}\".split(',')]\n\
\ json_data = ${download_json_results.json_data}\n source_file_name\
\ = \"${download_json_results.source_file_name}\"\n nango_connection_id\
\ = \"${nango_connection_id}\"\n nango_key = \"${nango_key}\"\n \
\ \n print(\"\U0001F680 PROCESSING WITH DYNAMIC MODELS AND GEMINI\")\n\
\ print(\"=\" * 60)\n print(f\"Analysis Years: {analysis_years}\"\
)\n print(f\"Source File: {source_file_name}\")\n \n # Setup\
\ Gemini\n model = setup_gemini(gemini_api_key)\n \n # Get\
\ sheet analysis data\n sheet_analysis = json_data['detailed_results']['sheet_analysis']\n\
\ successful_sheets = [sheet for sheet in sheet_analysis if sheet['status']\
\ == 'success']\n \n print(f\"\U0001F4CA Processing {len(successful_sheets)}\
\ successful sheets\")\n \n # Process each sheet for each analysis\
\ year\n processed_data = {}\n \n for sheet_data in successful_sheets:\n\
\ sheet_name = sheet_data['sheet_name']\n values_extracted\
\ = sheet_data['values_extracted']\n \n if not values_extracted:\n\
\ print(f\"\u26A0\uFE0F Skipping {sheet_name} - no values extracted\"\
)\n continue\n \n print(f\"\\n\U0001F4CB\
\ Processing sheet: {sheet_name}\")\n \n # Create dynamic\
\ Pydantic model\n dynamic_model, model_name = create_dynamic_pydantic_model(sheet_name,\
\ values_extracted)\n \n # Process for each analysis year\n\
\ year_results = {}\n for analysis_year in analysis_years:\n\
\ result = process_sheet_with_gemini(\n model,\
\ sheet_data, dynamic_model, model_name, analysis_year\n )\n \
\ if result:\n year_results[analysis_year] = result\n\
\ \n # Rate limiting\n time.sleep(2)\n\
\ \n if year_results:\n processed_data[sheet_name]\
\ = year_results\n print(f\"\u2705 Completed processing {sheet_name}\
\ for {len(year_results)} years\")\n \n # Delay between\
\ sheets\n time.sleep(1)\n \n if not processed_data:\n\
\ raise Exception(\"No data was successfully processed\")\n \
\ \n # Create Excel workbook\n excel_content = create_comprehensive_excel(processed_data,\
\ source_file_name)\n \n # Upload to Google Drive\n access_token\
\ = authenticate_google_drive(nango_connection_id, nango_key)\n timestamp\
\ = datetime.now().strftime('%Y%m%d_%H%M%S')\n filename = f\"Financial_Analysis_{source_file_name}_{timestamp}.xlsx\"\
\n \n upload_result = upload_excel_to_drive(excel_content, filename,\
\ access_token)\n \n # Compile final results\n final_results\
\ = {\n 'status': 'success',\n 'sheets_processed': len(processed_data),\n\
\ 'analysis_years': analysis_years,\n 'source_file': source_file_name,\n\
\ 'excel_created': upload_result['success'],\n 'excel_file_id':\
\ upload_result.get('file_id'),\n 'excel_file_url': upload_result.get('file_url'),\n\
\ 'excel_filename': filename,\n 'processing_timestamp':\
\ datetime.now().isoformat(),\n 'total_extractions': sum(len(year_data)\
\ for year_data in processed_data.values())\n }\n \n print(\"\
\\n\" + \"=\" * 70)\n print(\"\U0001F389 FINANCIAL ANALYSIS COMPLETE!\"\
)\n print(\"=\" * 70)\n print(f\"\u2705 Sheets processed: {final_results['sheets_processed']}\"\
)\n print(f\"\U0001F4C5 Analysis years: {len(analysis_years)}\")\n \
\ print(f\"\U0001F522 Total extractions: {final_results['total_extractions']}\"\
)\n \n if final_results['excel_created']:\n print(f\"\
\U0001F4CA Excel file created: {filename}\")\n print(f\"\U0001F517\
\ Access at: {final_results['excel_file_url']}\")\n else:\n \
\ print(f\"\u26A0\uFE0F Excel upload failed: {upload_result.get('error', 'Unknown\
\ error')}\")\n \n return final_results\n \n except Exception\
\ as e:\n return {\n 'status': 'error',\n 'error':\
\ str(e),\n 'sheets_processed': 0,\n 'excel_created': False,\n\
\ 'processing_timestamp': datetime.now().isoformat()\n }\n\n\
# Execute main function and output results\nresults = main()\nprint(f\"\\n__OUTPUTS__\
\ {json.dumps(results)}\")\n"
depends_on:
- download_json_results
description: Create dynamic Pydantic models from JSON and process with Gemini for
each analysis year
requirements:
- pydantic
- google-generativeai
- requests
- openpyxl
- pandas
timeout_seconds: 600
inputs:
- name: nango_connection_id
type: string
default: 4274993f-c614-4efa-a01e-8d07422f4b09
required: true
description: Nango connection ID for Google Drive access
- name: nango_key
type: string
default: 8df3e2de-2307-48d3-94bd-ddd3fd6a62ec
required: true
description: Nango authentication key
- name: file_id
type: string
default: 1Glz-mTMDCuGzUGoTLWOrSyIMI1sJDYlA
required: true
description: Google Drive file ID of the Excel file to analyze
- name: analysis_depth
type: string
default: comprehensive
required: false
description: 'Analysis depth: basic, comprehensive, or detailed'
- name: target_sheets
type: string
default: Volumes,Power & Fuel,Per tonne,P&L,Operating metrics,Cash Flow,Balance
Sheet,Financial Ratios,Valuation
required: false
description: Comma-separated list of sheet names to analyze
- name: results_file_name
type: string
default: Excel_Analysis_Results
required: false
description: Name for the results file to upload to Google Drive
- name: analysis_years
type: string
default: FY2024
required: false
description: Comma-separated list of analysis years to process
- name: gemini_api_key
type: string
default: AIzaSyB0_e6aU4gF-qRapMm3UYBSITpbd0ehsYk
required: true
description: Gemini API key for AI processing
outputs:
excel_created:
type: boolean
source: create_dynamic_models_and_process.excel_created
description: Whether Excel file was successfully created
final_excel_url:
type: string
source: create_dynamic_models_and_process.excel_file_url
description: Direct link to final Excel analysis file
sheets_available:
type: integer
source: download_json_results.sheet_count
description: Number of sheets available for processing
sheets_processed:
type: integer
source: create_dynamic_models_and_process.sheets_processed
description: Number of sheets successfully processed
processing_status:
type: string
source: create_dynamic_models_and_process.status
description: Overall processing status
total_extractions:
type: integer
source: create_dynamic_models_and_process.total_extractions
description: Total number of data extractions performed
final_excel_file_id:
type: string
source: create_dynamic_models_and_process.excel_file_id
description: Google Drive file ID of final Excel analysis
json_download_status:
type: string
source: download_json_results.status
description: Status of JSON file download
version: '2.0'
namespace: excel_analysis
description: Simplified workflow to analyze a single Excel file and upload results
to Google Drive
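Note the output contract the three scripts share: each ends by printing a single `__OUTPUTS__` line whose JSON payload becomes the task's result, and later tasks read those keys through `${task_id.key}` interpolation (for example `${analyze_excel_file.uploaded_file_id}` in download_json_results). A minimal illustration of the convention, with hypothetical values:

```python
import json

# Final line of a task script: the runner treats everything after the
# __OUTPUTS__ marker as the task's JSON result.
results = {"status": "success", "uploaded_file_id": "example-file-id"}  # hypothetical
print(f"__OUTPUTS__ {json.dumps(results)}")

# A downstream task then receives it via template interpolation, e.g.:
# results_file_id = "${analyze_excel_file.uploaded_file_id}"
```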
Execution History

| Execution ID | Status | Started | Duration |
|---|---|---|---|
| 7a8e8bb9... | COMPLETED | 2025-07-31 18:27:43 | N/A |
| 25eeb35a... | COMPLETED | 2025-07-31 18:24:53 | N/A |
| 8128a06a... | COMPLETED | 2025-07-31 18:07:49 | N/A |
| 29ad898b... | COMPLETED | 2025-07-31 17:01:50 | N/A |
| a99889a4... | COMPLETED | 2025-07-31 11:06:42 | N/A |
| cba6488d... | COMPLETED | 2025-07-31 11:02:07 | N/A |
| 8a67cfa3... | COMPLETED | 2025-07-31 10:53:23 | N/A |
| f07e88a6... | COMPLETED | 2025-07-31 10:35:46 | N/A |
| 7d0cb621... | COMPLETED | 2025-07-31 10:09:01 | N/A |
| 997f4416... | COMPLETED | 2025-07-31 09:33:18 | N/A |
| bc7419e9... | COMPLETED | 2025-07-31 09:27:11 | N/A |
| f46a2948... | COMPLETED | 2025-07-31 09:12:08 | N/A |
| ccb4fe69... | COMPLETED | 2025-07-31 09:09:30 | N/A |
| 6a4aed8b... | COMPLETED | 2025-07-31 07:51:28 | N/A |
| e46f3152... | COMPLETED | 2025-07-31 07:47:29 | N/A |
| ebe186d3... | COMPLETED | 2025-07-31 07:45:12 | N/A |
| a818f809... | COMPLETED | 2025-07-31 07:43:17 | N/A |
| c615b214... | COMPLETED | 2025-07-31 07:40:16 | N/A |
| f7a94b6a... | COMPLETED | 2025-07-31 07:38:43 | N/A |
| c4920a09... | COMPLETED | 2025-07-31 07:36:09 | N/A |