WIP field indexing

cghislai 2025-06-09 15:06:57 +02:00
parent fe43b72baf
commit f2568707e9
51 changed files with 62444 additions and 9 deletions

@ -0,0 +1,17 @@
# Google Cloud configuration
GOOGLE_CLOUD_PROJECT_ID=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1
GEMINI_MODEL=gemini-1.5-pro
GOOGLE_API_KEY=your_api_key
STORAGE_BUCKET_NAME=bucketname
# API configuration
API_BASE_URL=https://api.example.com
API_AUTH_URL=https://api.example.com
API_CLIENT_ID=your_client_id
API_CLIENT_SECRET=your_client_secret
# Function configuration
DEBUG=false
DRY_RUN_SKIP_GEMINI=true
DRY_RUN_SKIP_API_WRITE=true

@ -0,0 +1,4 @@
node_modules/
dist/
.env
coverage/

@ -0,0 +1,110 @@
# Field Request to Field Value Function
This function takes a project and optional documentId and fieldIdentificationRequestId, fetches project info, and uses field identification services to extract field values.
## Overview
The function follows this process:
1. Receives a request with project information and optional document/field request IDs
2. Validates the request parameters
3. Fetches project information
4. Fetches resources based on the request parameters
5. Uses Gemini AI with function calls to identify field values
6. Returns the identified field value
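The six steps above can be sketched as a single async function. This is only an illustration of the flow, not the function's actual implementation: `FieldRequestInput`, `PipelineDeps`, `runPipeline` and the three dependency callbacks are hypothetical names, and the summary shape is an assumption.

```typescript
// Hypothetical shapes, for illustration only; the real types live in the function's own modules.
interface FieldRequestInput {
  projectId?: string;
  documentId?: number;
}

interface FieldIdentificationSummary {
  project: string;
  documentId: number;
  indexedFields: number;
}

// Stand-ins for the project lookup, the resource fetch and the Gemini call (steps 3 to 5).
interface PipelineDeps {
  fetchProject(projectId: string): Promise<{ name: string }>;
  fetchFieldCodes(documentId: number): Promise<string[]>;
  identify(fieldCode: string): Promise<string | undefined>;
}

async function runPipeline(
  input: FieldRequestInput,
  deps: PipelineDeps,
): Promise<FieldIdentificationSummary> {
  // Steps 1 and 2: receive the request and validate its parameters.
  const { projectId, documentId } = input;
  if (!projectId) {
    throw new Error('projectId is required');
  }
  if (documentId === undefined) {
    throw new Error('documentId is required in this sketch');
  }
  // Step 3: fetch project information.
  const project = await deps.fetchProject(projectId);
  // Step 4: fetch the resources to work on (here, the field codes of the document).
  const fieldCodes = await deps.fetchFieldCodes(documentId);
  // Step 5: ask the model about each field in turn and count the values it found.
  let indexedFields = 0;
  for (const code of fieldCodes) {
    const value = await deps.identify(code);
    if (value !== undefined) {
      indexedFields += 1;
    }
  }
  // Step 6: return a summary of what was identified.
  return { project: project.name, documentId, indexedFields };
}
```

Passing the dependencies in as an object keeps the flow testable without network access: a test can supply in-memory stubs for the three callbacks.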
## Services
The function is composed of three main services:
### ProcessorService
- Orchestrates the entire field identification process
- Validates request parameters
- Fetches project information
- Calls the FieldIdentificationService
### FieldIdentificationService
- Fetches resources based on request inputs
- Calls the GeminiNitroService to identify field values
- Makes final API calls based on the Gemini response
### GeminiNitroService
- Interacts with Gemini AI using function calls
- Defines function declarations for Gemini
- Processes Gemini responses to extract field values
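To make the function-call exchange more concrete, here is a minimal sketch of such a loop, assuming the model either asks for a lookup (whose result is fed back to it) or calls a setter (which ends the exchange). The types and `runFunctionCallLoop` are hypothetical and do not use the Vertex AI SDK; the sketch is synchronous for brevity, and `maxRounds` is an assumption added so the loop cannot run forever when the model never calls a setter.

```typescript
// Minimal stand-ins for the model's reply; these are not the Vertex AI SDK types.
interface ModelFunctionCall {
  name: string;
  args: Record<string, unknown>;
}
interface ModelReply {
  text?: string;
  functionCalls: ModelFunctionCall[];
}
type Turn = { role: 'user' | 'model'; text: string };

// A handler either returns data for the model (a lookup) or a final value (a setter).
type HandlerResult = { response?: unknown; finalValue?: string };
type Handlers = Record<string, (args: Record<string, unknown>) => HandlerResult>;

function runFunctionCallLoop(
  askModel: (history: Turn[]) => ModelReply,
  handlers: Handlers,
  prompt: string,
  maxRounds = 5,
): string | undefined {
  const history: Turn[] = [{ role: 'user', text: prompt }];
  for (let round = 0; round < maxRounds; round++) {
    const reply = askModel(history);
    if (reply.text) {
      history.push({ role: 'model', text: reply.text });
    }
    for (const call of reply.functionCalls) {
      const handler = handlers[call.name];
      if (!handler) {
        throw new Error(`Unknown function ${call.name}`);
      }
      const result = handler(call.args);
      if (result.finalValue !== undefined) {
        // A setter was called: the field is considered identified.
        return result.finalValue;
      }
      // Feed lookup results back so the model can use them in the next round.
      history.push({ role: 'user', text: JSON.stringify(result.response ?? null) });
    }
  }
  // Give up after maxRounds instead of looping forever.
  return undefined;
}
```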
## Usage
### HTTP Endpoint
```bash
curl -X POST https://your-function-url/fieldRequestToFieldValueHttp \
-H "Content-Type: application/json" \
-d '{
"projectId": "my-project",
"documentId": 123
}'
```
### Cloud Event
The function can also be triggered by a Cloud Event with the following data:
```json
{
"projectId": "my-project",
"documentId": 123
}
```
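Both triggers need the same check on the incoming data before any processing starts. The sketch below assumes the payload carries a required `projectId` and an optional numeric `documentId`, which are the two values the handlers destructure; `FieldRequestPayload` and `parsePayload` are hypothetical names, not part of the function's code.

```typescript
// Hypothetical input shape, mirroring the two values the handlers destructure.
interface FieldRequestPayload {
  projectId: string;
  documentId?: number;
}

function parsePayload(raw: unknown): FieldRequestPayload {
  if (typeof raw !== 'object' || raw === null) {
    throw new Error('Payload must be a JSON object');
  }
  const { projectId, documentId } = raw as Record<string, unknown>;
  if (typeof projectId !== 'string' || projectId.length === 0) {
    throw new Error('projectId is required');
  }
  if (documentId === undefined || documentId === null) {
    return { projectId };
  }
  // Query-string values arrive as strings, so accept "123" as well as 123.
  const parsedId = typeof documentId === 'string' ? Number(documentId) : documentId;
  if (typeof parsedId !== 'number' || !Number.isInteger(parsedId)) {
    throw new Error('documentId must be an integer');
  }
  return { projectId, documentId: parsedId };
}
```

Normalizing the ID in one place lets the HTTP handler and the Cloud Event handler hand the same typed object to the processing code.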
## Configuration
The function requires the following environment variables:
| Variable | Description | Default |
|----------|-------------|---------|
| GOOGLE_CLOUD_PROJECT_ID | Google Cloud project ID | (required) |
| GOOGLE_CLOUD_LOCATION | Google Cloud location | us-central1 |
| GEMINI_MODEL | Gemini model to use | gemini-1.5-pro |
| API_BASE_URL | Base URL for API calls | https://api.example.com |
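| API_AUTH_URL | Base URL for authentication calls | https://api.example.com |
| API_CLIENT_ID | Client ID for API authentication | (required) |
| API_CLIENT_SECRET | Client secret for API authentication | (required) |
| STORAGE_BUCKET_NAME | Cloud Storage bucket name | test-spec-to-test-implementation |
| DEBUG | Log the request contents sent to Gemini | false |
| DRY_RUN_SKIP_GEMINI | Skip Gemini API calls in dry run mode | false |
| DRY_RUN_SKIP_API_WRITE | Skip API write calls in dry run mode | false |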
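As an illustration of how defaults and boolean flags from this table can be resolved, here is a small sketch. It covers only a subset of the variables, and `FunctionConfig`, `Env` and `loadConfig` are hypothetical names; the function's own configuration module may be organized differently.

```typescript
interface FunctionConfig {
  googleCloudProjectId: string;
  googleCloudLocation: string;
  geminiModel: string;
  apiBaseUrl: string;
  dryRunSkipGemini: boolean;
}

// An environment is just a map of optional strings, as process.env is in Node.js.
type Env = Record<string, string | undefined>;

function loadConfig(env: Env): FunctionConfig {
  const projectId = env.GOOGLE_CLOUD_PROJECT_ID;
  if (!projectId) {
    // Required variables have no default, so a missing value fails fast.
    throw new Error('GOOGLE_CLOUD_PROJECT_ID environment variable is required');
  }
  return {
    googleCloudProjectId: projectId,
    googleCloudLocation: env.GOOGLE_CLOUD_LOCATION || 'us-central1',
    geminiModel: env.GEMINI_MODEL || 'gemini-1.5-pro',
    apiBaseUrl: env.API_BASE_URL || 'https://api.example.com',
    // Flags are enabled only by the exact string "true".
    dryRunSkipGemini: env.DRY_RUN_SKIP_GEMINI === 'true',
  };
}
```

In a Node.js process this would be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test with a plain object.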
## Development
### Running Locally
```bash
npm install
npm run build
npm run start
```
### Testing
```bash
npm test
```
### Deployment
```bash
gcloud functions deploy fieldRequestToFieldValueHttp \
--gen2 \
--runtime nodejs20 \
--trigger-http \
--allow-unauthenticated
```
```bash
gcloud functions deploy fieldRequestToFieldValueEvent \
--gen2 \
--runtime nodejs20 \
--trigger-event-filters="type=google.cloud.storage.object.v1.finalized" \
--trigger-event-filters="bucket=your-bucket-name"
```

@ -0,0 +1,17 @@
#!/usr/bin/env bash
# Regenerate the TypeScript API client from the OpenAPI document.
set -eu
API_DOC_URI="${API_DOC_URI:-https://api.nitrodev.ebitda.tech/domain-ws/q/openapi?format=yaml}"
WD="$(realpath "$(dirname "$0")")"
ROOT_DIR="${WD}"
CLIENT_DIR="${ROOT_DIR}/src/client"
BIN_DIR="${ROOT_DIR}/node_modules/.bin"
rm -rf "${CLIENT_DIR}"
mkdir -p "${CLIENT_DIR}"
"${BIN_DIR}/openapi-ts" \
-c "@hey-api/client-fetch" \
-i "${API_DOC_URI}" \
-o "${CLIENT_DIR}"

@ -0,0 +1,27 @@
module.exports = {
preset: 'ts-jest',
testEnvironment: 'node',
roots: ['<rootDir>/src'],
testMatch: ['**/__tests__/**/*.ts', '**/?(*.)+(spec|test).ts'],
transform: {
'^.+\\.ts$': 'ts-jest',
},
moduleFileExtensions: ['ts', 'js', 'json', 'node'],
collectCoverage: true,
coverageDirectory: 'coverage',
collectCoverageFrom: [
'src/**/*.ts',
'!src/**/*.d.ts',
'!src/**/__tests__/**',
'!src/**/__mocks__/**',
],
coverageThreshold: {
global: {
branches: 20,
functions: 60,
lines: 40,
statements: 40,
},
},
setupFiles: ['<rootDir>/src/__tests__/setup.ts'],
};

File diff suppressed because it is too large

@ -0,0 +1,44 @@
{
"name": "field-request-to-field-value",
"version": "1.0.0",
"scripts": {
"build": "tsc",
"generate-api-client": "sh generate-api-client.sh",
"prestart": "npm run build",
"deploy": "gcloud functions deploy fieldRequestToFieldValueHttp --gen2 --runtime=nodejs20 --source=. --trigger-http --allow-unauthenticated",
"deploy:event": "gcloud functions deploy fieldRequestToFieldValueEvent --gen2 --runtime=nodejs20 --source=. --trigger-event=google.cloud.storage.object.v1.finalized --trigger-resource=YOUR_BUCKET_NAME",
"clean": "rm -rf dist",
"test": "jest",
"test:watch": "jest --watch",
"dev": "npm run build && functions-framework --target=fieldRequestToFieldValueHttp --port=18082",
"dev:watch": "concurrently \"tsc -w\" \"nodemon --watch dist/ --exec functions-framework --target=fieldRequestToFieldValueHttp --port=18082\"",
"dev:event": "npm run build && functions-framework --target=fieldRequestToFieldValueEvent --signature-type=event"
},
"main": "dist/index.js",
"dependencies": {
"@google-cloud/functions-framework": "^3.0.0",
"@google-cloud/vertexai": "^0.5.0",
"@hey-api/client-fetch": "0.8.4",
"@hey-api/openapi-ts": "0.64.15",
"@google-cloud/storage": "^7.16.0",
"axios": "^1.6.7",
"dotenv": "^16.4.5",
"shared-functions": "file:../shared"
},
"devDependencies": {
"@types/express": "^5.0.3",
"@types/jest": "^29.5.12",
"@types/node": "^20.11.30",
"concurrently": "^8.2.2",
"jest": "^29.7.0",
"nodemon": "^3.0.3",
"ts-jest": "^29.1.2",
"typescript": "^5.8.3"
},
"engines": {
"node": ">=20"
},
"files": [
"dist"
]
}

@ -0,0 +1,16 @@
/**
* Jest setup file
*
* This file is executed before each test file is run.
* It can be used to set up global test environment configurations.
*/
// Suppress console output during tests
global.console = {
...console,
log: jest.fn(),
error: jest.fn(),
warn: jest.fn(),
info: jest.fn(),
debug: jest.fn(),
};

@ -0,0 +1,18 @@
// This file is auto-generated by @hey-api/openapi-ts
import type { ClientOptions } from './types.gen';
import { type Config, type ClientOptions as DefaultClientOptions, createClient, createConfig } from '@hey-api/client-fetch';
/**
* The `createClientConfig()` function will be called on client initialization
* and the returned object will become the client's initial configuration.
*
* You may want to initialize your client this way instead of calling
* `setConfig()`. This is useful for example if you're using Next.js
* to ensure your client always has the correct values.
*/
export type CreateClientConfig<T extends DefaultClientOptions = ClientOptions> = (override?: Config<DefaultClientOptions & T>) => Config<Required<DefaultClientOptions> & T>;
export const client = createClient(createConfig<ClientOptions>({
baseUrl: 'http://localhost:33000/domain-ws'
}));

@ -0,0 +1,3 @@
// This file is auto-generated by @hey-api/openapi-ts
export * from './types.gen';
export * from './sdk.gen';

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large

@ -0,0 +1,52 @@
/**
* Configuration for the field-request-to-field-value function
*/
import path from "path";
import * as dotenv from 'dotenv';
dotenv.config({path: path.resolve(__dirname, '../.env')});
// Google Cloud configuration
export const DEBUG = process.env.DEBUG === 'true';
export const GOOGLE_CLOUD_PROJECT_ID = process.env.GOOGLE_CLOUD_PROJECT_ID || '';
export const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'us-central1';
export const GEMINI_MODEL = process.env.GEMINI_MODEL || 'gemini-1.5-pro';
// Dry run configuration
export const DRY_RUN_SKIP_GEMINI = process.env.DRY_RUN_SKIP_GEMINI === 'true';
export const DRY_RUN_SKIP_API_WRITE = process.env.DRY_RUN_SKIP_API_WRITE === 'true';
// API configuration
export const API_BASE_URL = process.env.API_BASE_URL || 'https://api.example.com';
export const API_AUTH_URL = process.env.API_AUTH_URL || 'https://api.example.com';
// No defaults for credentials, so validateConfig() can detect missing values
export const API_CLIENT_ID = process.env.API_CLIENT_ID || '';
export const API_CLIENT_SECRET = process.env.API_CLIENT_SECRET || '';
// Cloud storage configuration
export const STORAGE_BUCKET_NAME = process.env.STORAGE_BUCKET_NAME || 'test-spec-to-test-implementation';
/**
* Validate the configuration
* @throws Error if any required configuration is missing
*/
export function validateConfig(): void {
if (!GOOGLE_CLOUD_PROJECT_ID) {
throw new Error('GOOGLE_CLOUD_PROJECT_ID environment variable is required');
}
if (!API_AUTH_URL) {
throw new Error('API_AUTH_URL environment variable is required');
}
if (!API_BASE_URL) {
throw new Error('API_BASE_URL environment variable is required');
}
if (!API_CLIENT_ID) {
throw new Error('API_CLIENT_ID environment variable is required');
}
if (!API_CLIENT_SECRET) {
throw new Error('API_CLIENT_SECRET environment variable is required');
}
}

@ -0,0 +1,98 @@
/**
* Field Request to Field Value Function
*
* This function takes a project and optional documentId and fieldIdentificationRequestId,
* fetches project info, and uses field identification services to extract field values.
*/
import {CloudEvent, cloudEvent, http} from '@google-cloud/functions-framework';
import {ProcessorService} from './services/processor-service';
import {validateConfig} from './config';
import {FieldIdentificationRequestInput, FieldIdentificationResults, HttpResponse} from './types';
// Validate configuration on startup
try {
validateConfig();
} catch (error) {
console.error('Configuration error:', error instanceof Error ? error.message : String(error));
// Don't throw here to allow the function to start, but it will fail when executed
}
/**
* Format process results into a concise HTTP response
* @param result Field identification result
* @returns Formatted HTTP response
*/
export function formatHttpResponse(result: FieldIdentificationResults): HttpResponse {
return {
...result
};
}
/**
* HTTP endpoint for the field-request-to-field-value function
*/
http('fieldRequestToFieldValueHttp', async (req, res): Promise<void> => {
try {
// Extract request parameters from the query string, falling back to the JSON body
const projectId = (req.query.projectId ?? req.body?.projectId) as string | undefined;
const documentId = (req.query.documentId ?? req.body?.documentId) as string | undefined;
if (!projectId) {
res.status(400).json({
success: false,
error: 'Project is required'
});
return;
}
// Process the field identification request
const processor = new ProcessorService();
const result = await processor.processFieldIdentification({
projectId,
documentId,
});
// Format and return the response
const response = formatHttpResponse(result);
res.status(200).json(response);
} catch (error) {
console.error('Error processing field identification:', error);
const errorMessage = error instanceof Error ? error.message : String(error);
res.status(500).json({
success: false,
error: errorMessage
});
}
});
/**
* Cloud Event handler for the field-request-to-field-value function
*/
cloudEvent('fieldRequestToFieldValueEvent', async (event: CloudEvent<FieldIdentificationRequestInput>): Promise<void> => {
try {
console.log('Received event:', event.type);
// Extract request parameters from the event
const {projectId, documentId} = event.data || {};
if (!projectId) {
console.error('Error: Project is required');
throw new Error('Project is required');
}
// Process the field identification request
const processor = new ProcessorService();
const result = await processor.processFieldIdentification({
projectId,
documentId,
});
console.log('Field identification completed successfully');
} catch (error) {
console.error('Error processing field identification:', error);
throw error;
}
});

@ -0,0 +1,59 @@
/**
* Service for field identification
*/
import {GeminiNitroService} from './gemini-nitro-service';
import {GoogleCloudStorageConfig, NitroDocumentsService} from "./nitro-documents-service";
import {NitroAuthService} from "./nitro-auth-service";
import {FieldIdentificationResults} from "../types";
export class FieldIdentificationService {
private nitroDocumentService: NitroDocumentsService;
constructor(private geminiNitroService: GeminiNitroService,
nitroAuthService: NitroAuthService,
storageConfig: GoogleCloudStorageConfig) {
this.nitroDocumentService = new NitroDocumentsService(nitroAuthService, storageConfig)
}
/**
* Index a document: identify its field values and summarize the outcome
* @param projectId Project ID
* @param documentId Optional document ID; when omitted, a document to process is looked up
* @returns Summary of the indexed, problematic and skipped fields
*/
async indexDocument(
projectId: string,
documentId?: number,
): Promise<FieldIdentificationResults> {
console.log(`Identifying field for project: ${projectId}, documentId: ${documentId}`);
if (documentId) {
return this.processSingleDocument(projectId, documentId);
} else {
// Fetch some documents to process
const document = await this.nitroDocumentService.findADocumentToProcess();
if (document) {
return this.processSingleDocument(projectId, document.id!);
} else {
throw new Error('No documents found to process');
}
}
}
private async processSingleDocument(projectId: string, documentId: number): Promise<FieldIdentificationResults> {
const fieldRequests = await this.nitroDocumentService.searchFieldRequests(documentId);
const fields = await this.nitroDocumentService.getDocumentFields();
const fieldResults = await this.geminiNitroService.identifyFieldValues(projectId, documentId, fieldRequests, fields);
const submittedValues = fieldResults.fieldValues.filter(value => value.valueStatus === "SUBMITTED");
const problematicValues = fieldResults.fieldValues.filter(value => value.valueStatus === "PROBLEM");
const skippedFields = fields.length - fieldResults.fieldValues.length
return {
documentId: documentId,
indexedFields: submittedValues.length,
problematicFields: problematicValues.length,
skippedFields: skippedFields,
project: projectId
}
}
}

@ -0,0 +1,110 @@
/**
* Service for parsing field information from markdown files
*/
import * as fs from 'fs';
import * as path from 'path';
import {FieldInfo, ProjectFields} from '../types';
import {ProjectService} from 'shared-functions';
export class FieldParserService {
private projectService: ProjectService;
constructor(projectService?: ProjectService) {
this.projectService = projectService || new ProjectService();
}
/**
* Get all fields for a project
* @param projectId Project ID
* @returns Project fields information
*/
async getProjectFields(projectId: string): Promise<ProjectFields> {
// Get the main repository path using ProjectService
const mainRepoPath = await this.projectService.getMainRepositoryPath();
const promptsDir = path.join(mainRepoPath, 'src', 'prompts');
const functionName = 'field-request-to-field-value';
// Find all projects and filter for the requested one
const projects = await this.projectService.findProjects(promptsDir, functionName);
const project = projects.find(p => p.name === projectId);
if (!project) {
throw new Error(`Project not found: ${projectId}`);
}
// Construct the path to the project fields directory
const projectFieldsDir = path.join(project.path, 'fields');
// Check if the project fields directory exists
if (!fs.existsSync(projectFieldsDir)) {
throw new Error(`Project fields directory not found: ${projectFieldsDir}`);
}
// Get all field files in the directory
const fieldFiles = fs.readdirSync(projectFieldsDir)
.filter(file => file.endsWith('.md'));
// Parse each field file
const fields: FieldInfo[] = [];
for (const fieldFile of fieldFiles) {
const fieldPath = path.join(projectFieldsDir, fieldFile);
const fieldInfo = this.parseFieldFile(fieldPath);
fields.push(fieldInfo);
}
console.debug(`Parsed ${fields.length} fields for project: ${projectId}`)
return {
projectId,
fields
};
}
/**
* Parse a field file to extract field information
* @param filePath Path to the field file
* @returns Field information
*/
private parseFieldFile(filePath: string): FieldInfo {
// Read the field file
const content = fs.readFileSync(filePath, 'utf-8');
// Extract the field name from the file name
const fileName = path.basename(filePath, '.md');
// Extract function IDs and active status
const functionIds: string[] = [];
let isActive = false;
// Parse the content line by line
const lines = content.split('\n');
const promptLines: string[] = [];
let promptEnded = false;
for (const line of lines) {
// Check for function declarations
const functionMatch = line.match(/- \[(x)\] Function: (.+)/);
if (functionMatch) {
const functionId = functionMatch[2].trim();
functionIds.push(functionId);
promptEnded = true;
}
// Check for active status
const activeMatch = line.match(/- \[(x| )\] Active/);
if (activeMatch) {
isActive = activeMatch[1] === 'x';
promptEnded = true;
}
if (!promptEnded) {
promptLines.push(line);
}
}
return {
name: fileName,
prompt: promptLines.join('\n'),
functionIds,
isActive
};
}
}

@ -0,0 +1,719 @@
/**
* Service for interacting with Gemini using function calls to the Nitro backend
*/
import {
Content,
FunctionCall,
FunctionDeclaration,
FunctionDeclarationSchemaType,
FunctionResponse,
GenerativeModelPreview,
VertexAI
} from '@google-cloud/vertexai';
import {WsDocumentField, WsFieldIdentificationRequest, WsFieldIdentificationValue} from "../client/index";
import {NitroIndexingService} from "./nitro-indexing-service";
import {GoogleCloudStorageConfig, NitroDocumentsService} from "./nitro-documents-service";
import {FieldInfo} from "../types";
import {NitroThirdpartyService} from "./nitro-thirdparty-service";
import {FieldParserService} from "./field-parser-service";
import {NitroAuthService} from "./nitro-auth-service";
import {DEBUG, DRY_RUN_SKIP_API_WRITE} from "../config";
export interface FieldIdentificationResult {
fieldValues: WsFieldIdentificationValue[]
modelResponsesPerFieldCode: Record<string, string[]>;
tokenCount: number;
}
type FunctionId =
'setStringIdentification'
| 'setFieldProblematic'
| 'findThirdPartyByIdentifier'
| 'findThirdPartyByName'
| 'setThirdPartyIdentification'
| 'setCurrencyAmount'
| 'setTaxBreakDownIdentification'
| 'listDetails'
| 'listDocumentTypes'
| 'listPaymentModes'
| 'setDetailsBreakDownIdentification';
export type StringIdentificationType =
'currencyCode'
| 'date'
| 'dateTime'
| 'documentType'
| 'year'
| 'payerType'
| 'paymentMode'
| 'paymentStatus'
| 'structuredReference';
export interface FunctionArgs {
value?: string;
problemType?: "ABSENT" | "THIRDPARTY_DOES_NOT_EXISTS" | "THIRDPARTY_NOT_IDENTIFIABLE";
description?: string;
identifierType?: 'email' | 'phone' | 'VAT' | 'IBAN' | 'nationalEnterpriseNumber';
stringType?: StringIdentificationType;
identifierValue?: string;
countryCode?: string;
thirdPartyType?: 'company' | 'person' | 'official';
names?: string;
id?: number;
amount?: number;
taxBreakdown?: any;
detailsBreakdown?: any;
}
export class GeminiNitroService {
private vertexAI: VertexAI;
private model: string;
private projectId: string;
private location: string;
private dryRunSkipGemini: boolean;
private nitroIndexingServiec: NitroIndexingService;
private nitroDocumentService: NitroDocumentsService;
private nitroThirdPartyService: NitroThirdpartyService;
private fieldParser: FieldParserService;
private generativeModel: GenerativeModelPreview;
private functionDeclarations: Record<FunctionId, FunctionDeclaration>;
/**
* Create a new GeminiNitroService instance
* @param projectId Google Cloud project ID
* @param location Google Cloud location
* @param model Gemini model to use
* @param dryRunSkipGemini Whether to skip Gemini API calls in dry run mode
* @param nitroAuthService Authentication service shared by the Nitro API services
* @param storageConfig Google Cloud Storage configuration for document files
*/
constructor(
projectId: string,
location: string = 'us-central1',
model: string = 'gemini-1.5-pro',
dryRunSkipGemini: boolean = false,
nitroAuthService: NitroAuthService,
storageConfig: GoogleCloudStorageConfig
) {
this.projectId = projectId;
this.location = location;
this.model = model;
this.dryRunSkipGemini = dryRunSkipGemini;
this.fieldParser = new FieldParserService();
this.nitroDocumentService = new NitroDocumentsService(nitroAuthService, storageConfig);
this.nitroIndexingServiec = new NitroIndexingService(nitroAuthService);
this.nitroThirdPartyService = new NitroThirdpartyService(nitroAuthService);
if (!this.projectId) {
throw new Error('Google Cloud Project ID is required');
}
// Initialize VertexAI with default authentication
this.vertexAI = new VertexAI({
project: this.projectId,
location: this.location,
apiEndpoint: 'aiplatform.googleapis.com'
});
console.debug("Instantiating model...");
this.functionDeclarations = this.defineFunctionDeclarations();
this.generativeModel = this.vertexAI.preview.getGenerativeModel({
model: this.model,
tools: [{
function_declarations: Object.values(this.functionDeclarations)
}]
});
console.debug("Model ready");
}
/**
* Identify field values for a document using Gemini with function calls
* @param projectId Project ID, used to load the project's field definitions
* @param documentId Document ID
* @param fieldRequests Field identification requests to process
* @param fields Document fields known to the backend
* @returns Field identification result
*/
async identifyFieldValues(
projectId: string,
documentId: number,
fieldRequests: WsFieldIdentificationRequest[],
fields: WsDocumentField[]
): Promise<FieldIdentificationResult> {
console.log('Identifying field value using Gemini Nitro service');
// If dry run is enabled, return an empty result without calling Gemini
if (this.dryRunSkipGemini) {
console.log(`[DRY RUN] Skipping Gemini API call for field identification`);
return {
fieldValues: [],
modelResponsesPerFieldCode: {},
tokenCount: 0,
};
}
try {
// Load the project's field definitions and locate the document file
const fieldsModel = await this.fieldParser.getProjectFields(projectId);
const documentFile = await this.nitroDocumentService.getDocumentFile(documentId);
console.log(`Indexing document file ${documentFile.id} ${documentFile.documentFileType}`);
const documentFileStoredFile = await this.nitroDocumentService.getDocumentFileStoredFile(documentFile.id!);
const documentFileUri = await this.nitroDocumentService.getDocumentBucketUri(projectId, documentFile);
console.log(`Indexing document file at ${documentFileUri}`);
console.log("Starting gemini session for document id: " + documentId);
// Build the initial prompt: the document itself plus a summary of the available functions
const promptContents: Content[] = [{
role: 'user', parts: [
{
text: `Here is a document from which information must be extracted: ${documentFileStoredFile.fileName}`,
},
{
file_data: {
mime_type: documentFileStoredFile.fileType,
file_uri: documentFileUri
}
},
{
text: `You will be tasked to identify specific information in the document, sequentially.`,
},
{
text: this.generateFunctionCallHelp(Object.values(this.functionDeclarations))
}
]
}];
const result = await this.indexFields(fieldRequests, fields, fieldsModel.fields, promptContents);
return result;
} catch (error) {
console.error('Error in Gemini Nitro service:', error);
throw new Error(`Gemini Nitro service error: ${error instanceof Error ? error.message : String(error)}`);
}
}
private async indexFields(fieldRequests: WsFieldIdentificationRequest[],
fields: WsDocumentField[],
fieldsModel: FieldInfo[],
initialPromptContents: Content[],
result: FieldIdentificationResult = {
fieldValues: [],
modelResponsesPerFieldCode: {},
tokenCount: 0,
}): Promise<FieldIdentificationResult> {
let conversationContent = initialPromptContents;
console.debug(`${fieldRequests.length} requests to process...`);
const startDate = new Date();
for (const fieldRequest of fieldRequests) {
console.debug(`Processing request ${fieldRequest.id}...`);
const fieldCode = fieldRequest.documentFieldCode;
const field = fields.find(f => f.code === fieldCode);
const fieldInfo = fieldsModel.find(f => f.name === fieldCode);
// Each field continues the same conversation, so carry the updated contents forward
conversationContent = await this.identifyField(fieldRequest, field, fieldInfo, conversationContent, result);
}
const endTime = new Date();
const durationSeconds = (endTime.getTime() - startDate.getTime()) / 1000;
console.debug(`${fieldRequests.length} requests processed in ${durationSeconds} seconds`);
return result;
}
private async identifyField(fieldRequest: WsFieldIdentificationRequest,
field: WsDocumentField | undefined,
fieldInfo: FieldInfo | undefined,
currentPromptContent: Content[],
result: FieldIdentificationResult) {
if (field == null || fieldInfo == null) {
console.warn(`Skipping field ${fieldRequest.documentFieldCode} as it is not in the fields model`)
return currentPromptContent;
}
if (!fieldInfo.isActive) {
console.warn(`Skipping field ${fieldRequest.documentFieldCode} as it is not active`)
return currentPromptContent;
}
console.log(`Field ${fieldInfo.name} indexing starting...`);
const fieldStartData = new Date();
const fieldFunctions = Object.keys(this.functionDeclarations)
.filter(key => fieldInfo.functionIds.includes(key as FunctionId))
.map(key => this.functionDeclarations[key as FunctionId]);
// Call Gemini with the prompt
const requestContent = [
...currentPromptContent,
{
role: "user",
parts: [{
text: `Your task is now to identify this information in the document: ${JSON.stringify(field)}`
}, {
text: fieldInfo.prompt
}]
}
];
const conversationContent = await this.processUntilIndexed(requestContent, fieldRequest, fieldInfo, fieldFunctions, result);
const endTime = new Date();
const durationSeconds = (endTime.getTime() - fieldStartData.getTime()) / 1000;
console.debug(`Field indexed: ${fieldInfo.name} - ${durationSeconds} seconds`);
return conversationContent;
}
private async processUntilIndexed(requestContent: Content[],
fieldRequest: WsFieldIdentificationRequest,
fieldInfo: FieldInfo,
fieldFunctions: FunctionDeclaration[],
result: FieldIdentificationResult): Promise<Content[]> {
if (DEBUG) {
console.log(`request contents:`, requestContent)
}
const fieldResult = await this.generativeModel.generateContent({
contents: requestContent,
tools: [{
function_declarations: fieldFunctions,
}],
generation_config: {
temperature: 0.1,
max_output_tokens: 8192,
}
});
const candidates = fieldResult.response.candidates ?? [];
if (candidates.length === 0) { // No candidates
console.log(`no candidates for field ${fieldInfo.name}`);
return requestContent;
} else if (candidates.length > 1) {
console.log(`${candidates.length} candidates for field ${fieldInfo.name}`);
return requestContent;
}
const candidate = candidates[0];
const functionCalls: FunctionCall[] = [];
const outputForField = result.modelResponsesPerFieldCode[fieldInfo.name] ?? [];
const totalTokenCount = fieldResult.response.usageMetadata?.totalTokenCount ?? 0;
console.debug(` got a response. Token count ${totalTokenCount}`)
// Accumulate usage across calls rather than keeping only the last response's count
result.tokenCount += totalTokenCount;
const content = candidate.content;
const parts = content?.parts ?? [];
parts.forEach(part => {
if (part.functionCall) {
functionCalls.push(part.functionCall);
} else if (part.text != null) {
if (part.text) {
outputForField.push(part.text);
console.debug(part.text)
}
} else {
console.log(`Unknown part type: ${JSON.stringify(part)}`);
}
});
result.modelResponsesPerFieldCode[fieldInfo.name] = outputForField;
const conversationContent = [
...requestContent,
];
if (content && content.parts && content.parts.length > 0) {
conversationContent.push(content);
}
let fieldIndexationCompleted = false;
for (const functionCall of functionCalls) {
const functionId = functionCall.name as FunctionId;
const functionArgs = (typeof functionCall.args === 'string' ?
JSON.parse(functionCall.args) : functionCall.args) as FunctionArgs;
try {
const {
functionResponse,
fieldValue
} = await this.processFunctionCall(fieldRequest, functionId, functionArgs);
if (fieldValue) {
fieldIndexationCompleted = true;
const functionResponseContent = this.createFunctionExchangeContents(functionCall,
`Thank you. Your response has been stored with id ${fieldValue.id}.
Let's continue the identification process for other information...`);
conversationContent.push(...functionResponseContent);
} else if (functionResponse != null) {
const functionResponseContent = this.createFunctionExchangeContents(functionCall, functionResponse);
conversationContent.push(...functionResponseContent);
} else {
fieldIndexationCompleted = true;
console.warn(`Empty response for function call ${functionCall.name} for field ${fieldInfo.name}`);
const functionResponseContent = this.createFunctionExchangeContents(functionCall, `
There appears to be an issue with this function call. Let's continue the identification process for other information...
`);
conversationContent.push(...functionResponseContent);
}
} catch (error) {
fieldIndexationCompleted = true;
const errorMessage = error instanceof Error ? error.message : String(error);
console.warn(`Error for function call ${functionCall.name} for field ${fieldInfo.name}: ${errorMessage}`, error);
const functionResponseContent = this.createFunctionExchangeContents(functionCall, `
There appears to be an issue with this function call: ${errorMessage}.
Let's continue the identification process for other information...
`);
conversationContent.push(...functionResponseContent);
}
}
if (fieldIndexationCompleted) {
return conversationContent;
} else {
return this.processUntilIndexed(conversationContent, fieldRequest, fieldInfo, fieldFunctions, result);
}
}
private async processFunctionCall(fieldRequest: WsFieldIdentificationRequest, functionId: FunctionId, args: FunctionArgs): Promise<{
functionResponse?: any,
fieldValue?: WsFieldIdentificationValue,
}> {
console.debug(`Field ${fieldRequest.documentFieldCode}: Processing function call ${functionId} with args ${this.formatArgs(args)}`);
const dryRunValue: WsFieldIdentificationValue = {
valueStatus: "DISPLAYED",
identificationRequestWsRef: {id: fieldRequest.id!},
id: 0,
};
// Execute the function
switch (functionId) {
case "setStringIdentification":
return {
fieldValue: DRY_RUN_SKIP_API_WRITE ? dryRunValue : await this.nitroIndexingServiec.setStringIdentification(fieldRequest, args.stringType, args.value),
}
case "setFieldProblematic":
return {
fieldValue: DRY_RUN_SKIP_API_WRITE ? dryRunValue : await this.nitroIndexingServiec.setFieldProblem(fieldRequest, args.problemType, args.description),
}
case "findThirdPartyByIdentifier":
return {
functionResponse: await this.nitroThirdPartyService.findThirdPartyByIdentifier(args.identifierType, args.identifierValue, args.countryCode, args.thirdPartyType),
}
case "findThirdPartyByName":
return {
functionResponse: await this.nitroThirdPartyService.findThirdPartyByName(args.names, args.countryCode, args.thirdPartyType),
}
case "setThirdPartyIdentification":
return {
fieldValue: DRY_RUN_SKIP_API_WRITE ? dryRunValue : await this.nitroIndexingServiec.setThirdPartyIdentification(fieldRequest, args.id),
}
case "setCurrencyAmount":
return {
fieldValue: DRY_RUN_SKIP_API_WRITE ? dryRunValue : await this.nitroIndexingServiec.setCurrencyAmount(fieldRequest, args.amount),
}
case "setTaxBreakDownIdentification":
return {
fieldValue: DRY_RUN_SKIP_API_WRITE ? dryRunValue : await this.nitroIndexingServiec.setTaxBreadDownIdentification(fieldRequest, args.taxBreakdown),
}
case "listDetails":
return {
functionResponse: await this.nitroDocumentService.listDetails(fieldRequest),
}
case "listDocumentTypes":
return {
functionResponse: await this.nitroDocumentService.listDocumnentTypes(fieldRequest),
}
case "listPaymentModes":
return {
functionResponse: await this.nitroDocumentService.listPaymentModes(fieldRequest),
}
case "setDetailsBreakDownIdentification":
return {
fieldValue: DRY_RUN_SKIP_API_WRITE ? dryRunValue : await this.nitroIndexingServiec.setDetailsBreakDownIdentification(fieldRequest, args.detailsBreakdown),
}
default:
throw new Error(`Unknown function ${functionId}`);
}
}
private createFunctionExchangeContents(
functionCall: FunctionCall,
responseData: any,
): Content[] {
// Create a function response object
const functionResponseObj: FunctionResponse = {
name: functionCall.name,
response: {
data: JSON.stringify(responseData),
},
};
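// The Gemini API expects the content roles 'user' and 'model': the function call is replayed
// as the model turn, and the function response is sent back as the user turn.
return [
{
role: 'model',
parts: [
{
functionCall: functionCall
}
]
},
{
role: 'user',
parts: [
{
functionResponse: functionResponseObj
}
]
}
];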
}
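private formatArgs(args: FunctionArgs) {
// Render the arguments as comma-separated "key=value" pairs; objects and arrays are
// serialized as JSON so they do not print as "[object Object]".
return Object.entries(args)
.map(([key, value]) => `${key}=${typeof value === 'object' ? JSON.stringify(value) : value}`)
.join(', ');
}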
private generateFunctionCallHelp(functionDeclarations1: FunctionDeclaration[]) {
// Join with newlines explicitly: interpolating the array directly would separate entries with commas.
return `
During the identification process, you will have access to some of the following functions:
${functionDeclarations1.map(f =>
`${f.name}(${this.formatParameters(f)}): ${f.description}`
).join('\n')}
`;
}
private formatParameters(functionDeclaration: FunctionDeclaration) {
const params = functionDeclaration.parameters?.properties ?? {};
return Object.keys(params)
.join(', ');
}
/**
* Define function declarations for Gemini
* @returns Record of function declarations
*/
private defineFunctionDeclarations(): Record<FunctionId, FunctionDeclaration> {
return {
'findThirdPartyByIdentifier': {
name: 'findThirdPartyByIdentifier',
description: 'Searches for a third-party entity using a unique identifier such as email, phone, VAT, enterprise number or IBAN. Specify the identifier type, its value (without formatting, typically [A-Z0-9]+ except for email and national numbers), and the two-letter ISO 3166-1 alpha-2 country code. Optionally, specify the third-party type (company, person, or official).',
parameters: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {
identifierType: {
type: FunctionDeclarationSchemaType.STRING,
description: 'The type of identifier: \'email\', \'phone\', \'VAT\', \'nationalEnterpriseNumber\' or \'IBAN\'. '
+ 'National enterprise numbers might have various formats and names: BCE number, KBO number, SIRET, matricule, ... '
+ 'A VAT number is a tax identification number in European format (starting with a two-letter country code in capitals); TVA and BTW are other names for it.',
enum: ['email', 'phone', 'VAT', 'IBAN', "nationalEnterpriseNumber"]
},
identifierValue: {
type: FunctionDeclarationSchemaType.STRING,
description: 'The value of the identifier. Should be unformatted, typically [A-Z0-9]+, except for email addresses and national numbers.'
},
countryCode: {
type: FunctionDeclarationSchemaType.STRING,
description: 'The two-letter ISO 3166-1 alpha-2 country code.'
},
thirdPartyType: {
type: FunctionDeclarationSchemaType.STRING,
description: 'The type of third party: \'company\', \'person\', or \'official\'.'
}
},
required: ['identifierType', 'identifierValue', 'countryCode']
},
},
'findThirdPartyByName': {
name: 'findThirdPartyByName',
description: 'Searches for a third-party entity using its name. Provide the full name (company, person\'s first and last name, or official name) and the two-letter ISO 3166-1 alpha-2 country code. Ensure the returned results accurately match the identified third party. Optionally, specify the third-party type (company, person, or official).',
parameters: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {
names: {
type: FunctionDeclarationSchemaType.STRING,
description: 'The company name, or person\'s first and last name, or the official name.'
},
countryCode: {
type: FunctionDeclarationSchemaType.STRING,
description: 'The two-letter ISO 3166-1 alpha-2 country code.'
},
thirdPartyType: {
type: FunctionDeclarationSchemaType.STRING,
description: 'The type of third party: \'company\', \'person\', or \'official\'.'
},
},
required: ['names', 'countryCode']
},
},
'setThirdPartyIdentification': {
name: 'setThirdPartyIdentification',
description: 'Assigns an identified third-party entity to the current context. Requires the backend ID of the third party.',
parameters: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {
id: {
type: FunctionDeclarationSchemaType.NUMBER,
description: 'The backend ID of the third party.'
},
},
required: ['id']
},
},
'setFieldProblematic': {
name: 'setFieldProblematic',
description: 'Flags a field as unidentifiable. Specify the `problemType` (e.g., "ABSENT", "THIRDPARTY_DOES_NOT_EXISTS", "THIRDPARTY_NOT_IDENTIFIABLE") and an optional `description` explaining the issue.',
parameters: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {
problemType: {
type: FunctionDeclarationSchemaType.STRING,
description: 'The type of problem preventing the field from being identified. Must be one of: "ABSENT", "THIRDPARTY_DOES_NOT_EXISTS", or "THIRDPARTY_NOT_IDENTIFIABLE".',
enum: ["ABSENT", "THIRDPARTY_DOES_NOT_EXISTS", "THIRDPARTY_NOT_IDENTIFIABLE"]
},
description: {
type: FunctionDeclarationSchemaType.STRING,
description: 'A detailed description of the problem.'
}
},
required: ['problemType']
},
},
'setStringIdentification': {
name: 'setStringIdentification',
description: 'Sets the identified string value for a field. Use this function when a field\'s string content has been successfully identified.',
parameters: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {
value: {
type: FunctionDeclarationSchemaType.STRING,
description: 'The string value identified'
},
stringType: {
type: FunctionDeclarationSchemaType.STRING,
description: 'The type/format of the string value, optional.',
enum: ['currencyCode'
, 'date'
, 'dateTime'
, 'documentType'
, 'year'
, 'payerType'
, 'paymentMode'
, 'paymentStatus'
, 'structuredReference']
}
},
required: ['value']
}
},
'setCurrencyAmount': {
name: 'setCurrencyAmount',
description: 'Sets a currency amount for a field. The amount should be provided as a number.',
parameters: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {
amount: {
type: FunctionDeclarationSchemaType.NUMBER,
description: 'The currency amount identified.'
},
},
required: ['amount']
}
},
'setTaxBreakDownIdentification': {
name: 'setTaxBreakDownIdentification',
description: 'Sets a tax breakdown identification for a field. Requires an array of `taxBreakdown` items, each containing `taxRate` (between 0 and 1), `baseAmount`, and `taxAmount`. An optional `totalAmount` can also be provided.',
parameters: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {
taxBreakdown: {
type: FunctionDeclarationSchemaType.ARRAY,
description: 'An array of tax breakdown items.',
items: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {
taxRate: {
type: FunctionDeclarationSchemaType.NUMBER,
description: 'The tax rate, a number between 0 and 1.'
},
baseAmount: {
type: FunctionDeclarationSchemaType.NUMBER,
description: 'The base amount for the tax calculation.'
},
taxAmount: {
type: FunctionDeclarationSchemaType.NUMBER,
description: 'The calculated tax amount.'
},
totalAmount: {
type: FunctionDeclarationSchemaType.NUMBER,
description: 'The total amount, inclusive of tax (optional).'
}
},
required: ['taxRate', 'baseAmount', 'taxAmount']
}
}
},
required: ['taxBreakdown']
}
},
'listDetails': {
name: 'listDetails',
description: 'Retrieves a list of all available details identifiers.',
parameters: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {}
}
},
'listDocumentTypes': {
name: 'listDocumentTypes',
description: 'Retrieves a list of all supported document types.',
parameters: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {}
}
},
'listPaymentModes': {
name: 'listPaymentModes',
description: 'Retrieves a list of all available payment modes.',
parameters: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {}
}
},
'setDetailsBreakDownIdentification': {
name: 'setDetailsBreakDownIdentification',
description: 'Sets a details breakdown identification for a field. Requires an array of `detailsBreakdown` items, each with `details` (the identifier) and `taxExclusiveAmount`. An optional `taxInclusiveAmount` can also be provided.',
parameters: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {
detailsBreakdown: {
type: FunctionDeclarationSchemaType.ARRAY,
description: 'An array of details breakdown items.',
items: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {
details: {
type: FunctionDeclarationSchemaType.STRING,
description: 'The details identifier.'
},
taxExclusiveAmount: {
type: FunctionDeclarationSchemaType.NUMBER,
description: 'The amount exclusive of tax.'
},
taxInclusiveAmount: {
type: FunctionDeclarationSchemaType.NUMBER,
description: 'The amount inclusive of tax (optional).'
}
},
required: ['details', 'taxExclusiveAmount']
}
}
},
required: ['detailsBreakdown']
}
}
};
}
}
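
The dispatcher above forwards the arguments returned by Gemini to the backend services without checking them. A minimal sketch of validating them against a declaration's `required` list first. `DeclarationLike` and `findMissingRequiredArgs` are hypothetical names introduced for illustration, and the interface is a simplified stand-in, not the SDK's `FunctionDeclaration` type.

```typescript
// Simplified stand-in for the SDK's FunctionDeclaration (assumption, not the real type).
interface DeclarationLike {
  name: string;
  parameters?: {
    properties: Record<string, unknown>;
    required?: string[];
  };
}

// Returns the names of required parameters that are absent, null, or undefined in args.
function findMissingRequiredArgs(
  declaration: DeclarationLike,
  args: Record<string, unknown>,
): string[] {
  const required = declaration.parameters?.required ?? [];
  return required.filter((name) => args[name] === undefined || args[name] === null);
}

// Hypothetical declarations mirroring the shape of two entries defined above.
const findByNameDeclaration: DeclarationLike = {
  name: 'findThirdPartyByName',
  parameters: {
    properties: { names: {}, countryCode: {}, thirdPartyType: {} },
    required: ['names', 'countryCode'],
  },
};
const listDetailsDeclaration: DeclarationLike = {
  name: 'listDetails',
  parameters: { properties: {} },
};

const missingForFindByName = findMissingRequiredArgs(findByNameDeclaration, { names: 'ACME' });
const missingForListDetails = findMissingRequiredArgs(listDetailsDeclaration, {});
console.log(missingForFindByName, missingForListDetails);
```

A non-empty result could be returned to the model as a function response, so it can retry the call with the missing arguments, rather than surfacing as a backend error.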

@ -0,0 +1,74 @@
import {NitroClientConfig} from "../types";
import axios from "axios";
import {Client, createClient, createConfig, Options} from "@hey-api/client-fetch";
export class NitroAuthService {
private authUri: string;
private clientId: string;
private clientSecret: string;
private accessToken: string | null = null;
private expiresIn: number = 0;
private obtainedAt: number = 0;
public client: Client;
constructor(
clientConfig: NitroClientConfig
) {
this.authUri = clientConfig.authUrl;
this.clientId = clientConfig.clientId;
this.clientSecret = clientConfig.clientSecret
this.client = createClient(createConfig({
baseUrl: clientConfig.apiUrl,
}))
this.client.interceptors.request.use(async (request: Request, options: Options) => {
const token = await this.getAuthToken();
request.headers.set('Authorization', `Bearer ${token}`);
return request;
})
}
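async getAuthToken(): Promise<string> {
// Reuse the cached token while it is still valid; otherwise request a new one.
if (this.accessToken && this.checkTokenValid()) {
return this.accessToken;
}
return await this.obtainNewToken();
}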
private checkTokenValid(): boolean {
if (!this.accessToken || !this.expiresIn || !this.obtainedAt) {
return false;
}
const now = Date.now();
const expirationTime = this.obtainedAt + (this.expiresIn * 1000); // expiresIn is in seconds
// Add a small buffer (e.g., 60 seconds) to avoid using an expired token
return expirationTime - now > 60 * 1000;
}
private async obtainNewToken(): Promise<string> {
const tokenUrl = `${this.authUri}/protocol/openid-connect/token`;
const params = new URLSearchParams();
params.append('grant_type', 'client_credentials');
params.append('client_id', this.clientId);
params.append('client_secret', this.clientSecret);
try {
const response = await axios.post(tokenUrl, params, {
headers: {
'Content-Type': 'application/x-www-form-urlencoded'
}
});
this.accessToken = response.data.access_token;
this.expiresIn = response.data.expires_in;
this.obtainedAt = Date.now();
return this.accessToken!;
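} catch (error) {
// Log only the message: the full axios error embeds the request body, which contains the client secret.
const reason = error instanceof Error ? error.message : String(error);
console.error(`Error obtaining new token: ${reason}`);
throw new Error(`Failed to obtain new token: ${reason}`);
}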
}
}
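
Because the request interceptor calls `getAuthToken()` for every request, several concurrent requests issued while the token is expired would each trigger their own token call. A minimal sketch of the expiry check as a pure function plus a cache that shares one in-flight refresh. `isTokenFresh`, `SingleFlightTokenCache`, and `TokenResponse` are hypothetical names, and the fetcher and clock are injected assumptions so the logic can run without a real auth server.

```typescript
// Hypothetical response shape; the real service reads access_token / expires_in from the auth server.
interface TokenResponse {
  accessToken: string;
  expiresInSeconds: number;
}

// True when the token still has more than bufferMs of lifetime left.
function isTokenFresh(
  obtainedAtMs: number,
  expiresInSeconds: number,
  nowMs: number,
  bufferMs: number = 60 * 1000,
): boolean {
  if (obtainedAtMs <= 0 || expiresInSeconds <= 0) {
    return false;
  }
  const expiresAtMs = obtainedAtMs + expiresInSeconds * 1000;
  return expiresAtMs - nowMs > bufferMs;
}

class SingleFlightTokenCache {
  private token: string | null = null;
  private obtainedAtMs = 0;
  private expiresInSeconds = 0;
  private inFlight: Promise<string> | null = null;

  constructor(
    private readonly fetchToken: () => Promise<TokenResponse>,
    private readonly now: () => number = Date.now,
  ) {}

  // Deliberately not async, so concurrent callers receive the very same promise.
  get(): Promise<string> {
    if (this.token !== null && isTokenFresh(this.obtainedAtMs, this.expiresInSeconds, this.now())) {
      return Promise.resolve(this.token);
    }
    if (this.inFlight === null) {
      this.inFlight = this.fetchToken()
        .then((response) => {
          this.token = response.accessToken;
          this.expiresInSeconds = response.expiresInSeconds;
          this.obtainedAtMs = this.now();
          return response.accessToken;
        })
        .finally(() => {
          // Clear the marker on success and on failure so a later call can retry.
          this.inFlight = null;
        });
    }
    return this.inFlight;
  }
}

// Usage with a stubbed fetcher and a fixed clock.
let fetchCount = 0;
const tokenCache = new SingleFlightTokenCache(
  () => {
    fetchCount += 1;
    return Promise.resolve({ accessToken: 'stub-token', expiresInSeconds: 3600 });
  },
  () => 1000,
);
const firstCall = tokenCache.get();
const secondCall = tokenCache.get();
```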

@ -0,0 +1,189 @@
import {
getCustomerDocumentById,
getCustomerDocumentByIdDetailsList,
getCustomerDocumentByIdFiles,
getCustomerDocumentFileByIdFile,
getCustomerDocumentFileByIdFileContent,
getDocumentFieldAll,
getDocumentTypeList,
getPaymentModeList,
postFieldIdentificationRequestSearch, WsCustomerDocument,
WsCustomerDocumentFile,
WsDocumentField,
WsDocumentTypeModel,
WsFieldIdentificationRequest,
WsFieldIdentificationRequestSearch,
WsPaymentModeModel,
WsStoredFile
} from "../client/index";
import {NitroAuthService} from "./nitro-auth-service";
import {Bucket, Storage} from '@google-cloud/storage';
export interface GoogleCloudStorageConfig {
bucketName: string;
}
export class NitroDocumentsService {
private storage: Storage; // Declare storage property
private bucket: Bucket;
constructor(
private authService: NitroAuthService,
private storageConfig: GoogleCloudStorageConfig
) {
this.storage = new Storage(); // Initialize Storage in the constructor
const bucketName = storageConfig.bucketName;
console.debug(`Accessing bucket ${bucketName}...`)
this.bucket = this.storage.bucket(bucketName);
console.debug(`Bucket ready`);
}
async searchFieldRequests(documentId: number): Promise<WsFieldIdentificationRequest[]> {
const fieldRequestSearch: WsFieldIdentificationRequestSearch = {
anyStatus: ["WAITING_FOR_INDEXING"],
customerDocumentSearch: {
exactWsCustomerDocumentWsRef: {id: documentId},
anyStatus: [
"SORTABLE",
"SORTED"
]
}
};
const result = await postFieldIdentificationRequestSearch({
client: this.authService.client,
query: {
first: 0, length: 100,
},
body: fieldRequestSearch,
});
return result.data?.itemList ?? []
}
async getDocumentFields(): Promise<WsDocumentField[]> {
const result = await getDocumentFieldAll({
client: this.authService.client,
});
return result.data ?? []
}
async getDocumentFile(documentId: number): Promise<WsCustomerDocumentFile> {
const result = await getCustomerDocumentByIdFiles({
client: this.authService.client,
path: {
id: documentId,
},
query: {
first: 0, length: 10,
},
});
const list = result.data?.itemList ?? [];
const filteredList = list.filter(file => file.documentFileType === "MAIN")
if (filteredList.length === 0) {
throw new Error(`No MAIN document file found for document ${documentId}`);
}
return filteredList[0];
}
async getDocumentFileStoredFile(documentFileId: number): Promise<WsStoredFile> {
const result = await getCustomerDocumentFileByIdFile({
client: this.authService.client,
path: {
id: documentFileId,
},
});
return result.data!;
}
async getDocumentFileContent(documentId: number, documentFileId: number): Promise<Blob> {
// documentId is currently unused: the content endpoint is addressed by the file ID alone.
const result = await getCustomerDocumentFileByIdFileContent({
client: this.authService.client,
parseAs: "blob",
path: {
id: documentFileId,
},
});
return result.data as Blob;
}
async listDetails(fieldRequest: WsFieldIdentificationRequest): Promise<string[]> {
const result = await getCustomerDocumentByIdDetailsList({
client: this.authService.client,
path: {
id: fieldRequest.customerDocumentWsRef.id,
},
});
return result.data ?? [];
}
async listDocumnentTypes(fieldRequest: WsFieldIdentificationRequest): Promise<WsDocumentTypeModel[]> {
const result = await getDocumentTypeList({
client: this.authService.client,
});
return result.data ?? [];
}
async listPaymentModes(fieldRequest: WsFieldIdentificationRequest): Promise<WsPaymentModeModel[]> {
const result = await getPaymentModeList({
client: this.authService.client,
});
return result.data ?? [];
}
async getDocumentBucketUri(projectId: string, documentFile: WsCustomerDocumentFile) {
const {bucketName} = this.storageConfig;
const documentPath = `project/${projectId}/documentsFile/${documentFile.id}`;
const uri = `gs://${bucketName}/${documentPath}`;
console.debug(`Checking for document at ${uri}...`);
const file = this.bucket.file(documentPath);
// Check if the file exists
const [exists] = await file.exists();
if (!exists) {
console.debug(`File does not exist. Uploading to: ${uri}`);
// Fetch the content of the file from your existing service
const fileContentBlob = await this.getDocumentFileContent(documentFile.customerDocumentWsRef.id!, documentFile.id!);
const fileContentBuffer = await fileContentBlob.arrayBuffer();
await file.save(Buffer.from(fileContentBuffer)); // Upload the file
console.debug(`File uploaded to: ${uri}`);
} else {
console.debug(`File already exists at: ${uri}`);
}
return uri;
}
async findADocumentToProcess(): Promise<WsCustomerDocument | undefined> {
const fieldRequestSearch: WsFieldIdentificationRequestSearch = {
anyStatus: ["WAITING_FOR_INDEXING"],
customerDocumentSearch: {
anyStatus: [
"SORTABLE",
"SORTED"
]
}
}
const response = await postFieldIdentificationRequestSearch({
client: this.authService.client,
query: {
first: 0, length: 1,
},
body: fieldRequestSearch,
});
const results = response.data?.itemList ?? [];
if (results.length === 0) {
return undefined;
}
const documentResponse = await getCustomerDocumentById({
client: this.authService.client,
path: {
id: results[0].customerDocumentWsRef.id,
},
});
return documentResponse.data;
}
}
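
`getDocumentBucketUri` interpolates `documentFile.id` directly into the object path, so a file without an ID would be looked up under `.../documentsFile/undefined`. A minimal sketch of building the path and URI in a pure helper that rejects such inputs. `buildDocumentLocation` and `DocumentLocation` are hypothetical names; the path layout is copied from the method above.

```typescript
interface DocumentLocation {
  objectPath: string; // path of the object inside the bucket
  uri: string;        // gs:// URI pointing at the same object
}

// Builds the storage location for a document file, failing fast on missing inputs.
function buildDocumentLocation(
  bucketName: string,
  projectId: string,
  documentFileId: number | undefined,
): DocumentLocation {
  if (!bucketName || !projectId) {
    throw new Error('bucketName and projectId are required');
  }
  if (documentFileId === undefined || !Number.isInteger(documentFileId)) {
    throw new Error(`Invalid document file id: ${documentFileId}`);
  }
  const objectPath = `project/${projectId}/documentsFile/${documentFileId}`;
  return { objectPath, uri: `gs://${bucketName}/${objectPath}` };
}

const documentLocation = buildDocumentLocation('bucketname', 'demo-project', 42);
console.log(documentLocation.uri);
```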

@ -0,0 +1,257 @@
import {
postCurrencySearch,
postFieldIdentificationValue,
putFieldIdentificationValueById,
putFieldIdentificationValueByIdValueCurrencyIdentification,
putFieldIdentificationValueByIdValueDateIdentification,
putFieldIdentificationValueByIdValueDateTimeIdentification,
putFieldIdentificationValueByIdValueDecimalNumberIdentification,
putFieldIdentificationValueByIdValueDocumentTypeIdentification,
putFieldIdentificationValueByIdValuePayerEntityIdentification,
putFieldIdentificationValueByIdValuePaymentModeIdentification,
putFieldIdentificationValueByIdValuePaymentStatusIdentification,
putFieldIdentificationValueByIdValuePlainStringIdentification,
putFieldIdentificationValueByIdValueStructuredPaymentReferenceIdentification,
putFieldIdentificationValueByIdValueTaxBreakdown,
putFieldIdentificationValueByIdValueThirdPartyIdentification,
putFieldIdentificationValueByIdValueYearIdentification,
WsCurrency, WsCurrencySearch,
WsDocumentType,
WsFieldIdentificationRequest,
WsFieldIdentificationValue,
WsPayerEntity,
WsPaymentMode,
WsPaymentStatus,
WsTaxBreakdownLine
} from "../client/index";
import {NitroAuthService} from "./nitro-auth-service";
import {StringIdentificationType} from "./gemini-nitro-service";
export class NitroIndexingService {
constructor(
private authService: NitroAuthService,
) {
}
async setStringIdentification(fieldRequest: WsFieldIdentificationRequest, type: StringIdentificationType | undefined, value: string | undefined): Promise<WsFieldIdentificationValue> {
const fieldValue = await this.createValueDraft(fieldRequest);
const response = await this.submitStringIdentification(fieldValue, type, value);
return await this.submitValue(this.getSubmittedValue(response));
}
async setFieldProblem(fieldRequest: WsFieldIdentificationRequest, problemType: "ABSENT" | "THIRDPARTY_DOES_NOT_EXISTS" | "THIRDPARTY_NOT_IDENTIFIABLE" | undefined, description: string | undefined): Promise<WsFieldIdentificationValue> {
const fieldValue = await this.createValueDraft(fieldRequest);
let problemValue: WsFieldIdentificationValue;
if (problemType === "ABSENT" || problemType === undefined) {
problemValue = {
...fieldValue,
identifiedValue: '',
}
return await this.submitValue(problemValue);
} else {
problemValue = {
...fieldValue,
fieldProblemType: problemType,
fieldProblemDetails: description ?? "",
};
return await this.submitProblematicValue(problemValue);
}
}
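async setThirdPartyIdentification(fieldRequest: WsFieldIdentificationRequest, id: number | undefined): Promise<WsFieldIdentificationValue> {
// Fail before creating a draft value when no third-party ID was provided.
if (id === undefined) {
throw new Error(`No third party id provided`);
}
const fieldValue = await this.createValueDraft(fieldRequest);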
const response = await putFieldIdentificationValueByIdValueThirdPartyIdentification({
client: this.authService.client,
path: {
id: fieldValue.id!,
},
body: {
thirdPartyEntityWsRef: {id: id!},
},
});
return await this.submitValue(this.getSubmittedValue(response));
}
async setCurrencyAmount(fieldRequest: WsFieldIdentificationRequest, amount: number | undefined): Promise<WsFieldIdentificationValue> {
const fieldValue = await this.createValueDraft(fieldRequest);
const response = await putFieldIdentificationValueByIdValueDecimalNumberIdentification({
client: this.authService.client,
path: {
id: fieldValue.id!,
},
body: {
value: amount,
},
});
return await this.submitValue(this.getSubmittedValue(response));
}
async setTaxBreadDownIdentification(fieldRequest: WsFieldIdentificationRequest, taxBreakdown: WsTaxBreakdownLine[]): Promise<WsFieldIdentificationValue> {
const fieldValue = await this.createValueDraft(fieldRequest);
const response = await putFieldIdentificationValueByIdValueTaxBreakdown({
client: this.authService.client,
path: {
id: fieldValue.id!,
},
body: {
lines: taxBreakdown,
},
});
return await this.submitValue(this.getSubmittedValue(response));
}
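async setDetailsBreakDownIdentification(fieldRequest: WsFieldIdentificationRequest, detailsBreakdown: WsTaxBreakdownLine[]): Promise<WsFieldIdentificationValue> {
// NOTE: this currently reuses the tax-breakdown line type and endpoint, although the Gemini declaration
// for this function sends items shaped as {details, taxExclusiveAmount, taxInclusiveAmount}.
const fieldValue = await this.createValueDraft(fieldRequest);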
const response = await putFieldIdentificationValueByIdValueTaxBreakdown({
client: this.authService.client,
path: {
id: fieldValue.id!,
},
body: {
lines: detailsBreakdown,
},
});
return await this.submitValue(this.getSubmittedValue(response));
}
private async createValueDraft(fieldRequest: WsFieldIdentificationRequest) {
const value: WsFieldIdentificationValue = {
identificationRequestWsRef: {id: fieldRequest.id!},
valueStatus: "DISPLAYED",
}
const response = await postFieldIdentificationValue({
client: this.authService.client,
body: value,
});
// Surface API errors instead of returning an undefined draft.
return this.getSubmittedValue(response);
}
private async submitValue(value: WsFieldIdentificationValue) {
const submittedValue: WsFieldIdentificationValue = {
...value,
valueStatus: "SUBMITTED",
}
const response = await putFieldIdentificationValueById({
client: this.authService.client,
path: {
id: value.id!,
},
body: submittedValue,
});
return this.getSubmittedValue(response);
}
private async submitProblematicValue(value: WsFieldIdentificationValue) {
const submittedValue: WsFieldIdentificationValue = {
...value,
valueStatus: "PROBLEM",
}
const response = await putFieldIdentificationValueById({
client: this.authService.client,
path: {
id: value.id!,
},
body: submittedValue,
});
return this.getSubmittedValue(response);
}
private getSubmittedValue(response: {
data?: WsFieldIdentificationValue;
error?: any
}): WsFieldIdentificationValue {
if (response.error) {
throw response.error;
}
return response.data!;
}
private async submitStringIdentification(fieldValue: WsFieldIdentificationValue, type: StringIdentificationType | undefined, value: string | undefined): Promise<{
data?: WsFieldIdentificationValue;
error?: any
}> {
switch (type) {
case 'date':
return await putFieldIdentificationValueByIdValueDateIdentification({
client: this.authService.client,
path: {id: fieldValue.id!},
body: {value: value},
})
case 'dateTime':
return await putFieldIdentificationValueByIdValueDateTimeIdentification({
client: this.authService.client,
path: {id: fieldValue.id!},
body: {value: value},
})
case 'documentType':
return await putFieldIdentificationValueByIdValueDocumentTypeIdentification({
client: this.authService.client,
path: {id: fieldValue.id!},
body: {value: value as WsDocumentType},
})
case 'year':
return await putFieldIdentificationValueByIdValueYearIdentification({
client: this.authService.client,
path: {id: fieldValue.id!},
body: {value: parseInt(value!, 10)},
})
case 'payerType':
return await putFieldIdentificationValueByIdValuePayerEntityIdentification({
client: this.authService.client,
path: {id: fieldValue.id!},
body: {value: value as WsPayerEntity},
})
case 'currencyCode': {
const currency = await this.searchCUrrencyByCode(value!);
return await putFieldIdentificationValueByIdValueCurrencyIdentification({
client: this.authService.client,
path: {id: fieldValue.id!},
body: {id: currency.id!},
})
}
case 'paymentMode':
return await putFieldIdentificationValueByIdValuePaymentModeIdentification({
client: this.authService.client,
path: {id: fieldValue.id!},
body: {value: value as WsPaymentMode},
})
case 'paymentStatus':
return await putFieldIdentificationValueByIdValuePaymentStatusIdentification({
client: this.authService.client,
path: {id: fieldValue.id!},
body: {value: value as WsPaymentStatus},
})
case 'structuredReference':
return await putFieldIdentificationValueByIdValueStructuredPaymentReferenceIdentification({
client: this.authService.client,
path: {id: fieldValue.id!},
body: {value: value},
})
default:
return await putFieldIdentificationValueByIdValuePlainStringIdentification({
client: this.authService.client,
path: {id: fieldValue.id!},
body: {value: value},
});
}
}
private async searchCUrrencyByCode(code: string): Promise<WsCurrency> {
const search: WsCurrencySearch = {
exactCode: code,
};
const response = await postCurrencySearch({
client: this.authService.client,
body: search,
});
const list = response.data?.itemList ?? [];
if (list.length === 0) {
throw new Error(`Currency with code ${code} not found`);
}
return list[0];
}
}
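
The `'year'` branch of `submitStringIdentification` relies on `parseInt`, which stops at the first non-digit character: `parseInt('20x4', 10)` yields `20` rather than failing. A minimal sketch of a stricter conversion. `parseYearStrict` is a hypothetical helper, and restricting years to exactly four digits is an assumption about the expected input, not something the backend API is known to require.

```typescript
// Converts a model-provided string into a year, rejecting anything that is not four digits.
function parseYearStrict(value: string | undefined): number {
  const trimmed = (value ?? '').trim();
  if (!/^\d{4}$/.test(trimmed)) {
    throw new Error(`Not a four-digit year: "${value}"`);
  }
  return Number(trimmed);
}

// Small wrapper used below to show which inputs are accepted.
function tryParseYear(value: string | undefined): number | null {
  try {
    return parseYearStrict(value);
  } catch (e) {
    return null;
  }
}

const parsedYear = tryParseYear(' 2024 ');
const rejectedYear = tryParseYear('20x4');
console.log(parsedYear, rejectedYear);
```

A rejected value could then be routed to `setFieldProblem` instead of being submitted as a wrong year.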

@ -0,0 +1,263 @@
import {NitroAuthService} from "./nitro-auth-service";
import {
getCountryById,
getThirdPartyById,
getThirdPartyEntityById,
postThirdPartyEntityIdentifierSearch,
postThirdPartyEntitySearch,
WsRefWsThirdPartyEntity,
WsThirdParty,
WsThirdPartyEntity,
WsThirdPartyEntityIdentifierSearch,
WsThirdPartyEntitySearch,
WsThirdPartyType
} from "../client/index";
type ThirdpartyModel = Pick<WsThirdPartyEntity, 'id' | 'fullName'>
& Pick<WsThirdParty, 'thirdPartyType' | 'companyType' | 'city' | 'address' | 'zip' | 'enterpriseNumber' | 'vatNumber' | 'vatLiability' | 'officialName'>
& { countryCode: string };
export class NitroThirdpartyService {
constructor(
private authService: NitroAuthService,
) {
}
async findThirdPartyByIdentifier(identifierType: "email" | "phone" | "VAT" | "IBAN" | "nationalEnterpriseNumber" | undefined, identifierValue: string | undefined, countryCode: string | undefined, thirdPartyType: "company" | "person" | "official" | undefined): Promise<any> {
if (identifierValue === undefined) {
throw new Error(`No identifier value provided`);
}
const wsThirdPartyType = this.getThirdPartyType(thirdPartyType);
if (identifierType === "email") {
return this.findThirdPartyByEmail(identifierValue, countryCode, wsThirdPartyType);
} else if (identifierType === "phone") {
return this.findThirdPartyByPhone(identifierValue, countryCode, wsThirdPartyType);
} else if (identifierType === "VAT") {
return this.findThirdPartyByVat(identifierValue, countryCode, wsThirdPartyType);
} else if (identifierType === "nationalEnterpriseNumber") {
return this.findThirdPartyByEnterpriseNumber(identifierValue, countryCode, wsThirdPartyType);
} else if (identifierType === "IBAN") {
return this.findThirdPartyByIban(identifierValue, countryCode, wsThirdPartyType);
} else {
throw new Error(`Unsupported or missing identifier type: ${identifierType}`);
}
}
async findThirdPartyByName(names: string | undefined, countryCode: string | undefined, thirdPartyType: "company" | "person" | "official" | undefined): Promise<any> {
const wsThirdPartyType = this.getThirdPartyType(thirdPartyType);
const search: WsThirdPartyEntitySearch = {
fullNameSearch: names ? {
contains: names
} : undefined,
thirdPartySearch: {
wsCountrySearch: {
exactCode: countryCode,
},
anyThirdPartyType: wsThirdPartyType ? [wsThirdPartyType, "WAITING_VALIDATION"] : undefined
}
}
const result = await postThirdPartyEntitySearch({
client: this.authService.client,
query: {
first: 0, length: 10,
},
body: search,
});
const entityList = result.data?.itemList ?? [];
return await this.createModelsFromEntityList(entityList);
}
private async findThirdPartyByEmail(identifierValue: string, countryCode: string | undefined, thirdPartyType: WsThirdPartyType | undefined) {
const search: WsThirdPartyEntityIdentifierSearch = {
thirdPartyIdentifierSearch: {
anyType: ["EMAIL"],
exactValue: identifierValue
},
thirdPartyEntitySearch: {
thirdPartySearch: {
wsCountrySearch: {
exactCode: countryCode,
},
anyThirdPartyType: thirdPartyType ? [thirdPartyType, "WAITING_VALIDATION"] : undefined
}
}
}
const result = await postThirdPartyEntityIdentifierSearch({
client: this.authService.client,
query: {
first: 0, length: 10,
},
body: search,
});
const identifierList = result.data?.itemList ?? [];
const entityRefs = identifierList.map(i => i.thirdPartyEntityWsRef);
return await this.createModelsFromEntityRefList(entityRefs);
}
private async findThirdPartyByPhone(identifierValue: string, countryCode: string | undefined, thirdPartyType: WsThirdPartyType | undefined) {
const search: WsThirdPartyEntityIdentifierSearch = {
thirdPartyIdentifierSearch: {
anyType: ["PHONE_NUMBER"],
exactValue: identifierValue
},
thirdPartyEntitySearch: {
thirdPartySearch: {
wsCountrySearch: {
exactCode: countryCode,
},
anyThirdPartyType: thirdPartyType ? [thirdPartyType, "WAITING_VALIDATION"] : undefined
}
}
}
const result = await postThirdPartyEntityIdentifierSearch({
client: this.authService.client,
query: {
first: 0, length: 10,
},
body: search,
});
const identifierList = result.data?.itemList ?? [];
const entityRefs = identifierList.map(i => i.thirdPartyEntityWsRef);
return await this.createModelsFromEntityRefList(entityRefs);
}
private async findThirdPartyByIban(identifierValue: string, countryCode: string | undefined, thirdPartyType: WsThirdPartyType | undefined) {
const search: WsThirdPartyEntityIdentifierSearch = {
thirdPartyIdentifierSearch: {
anyType: ["IBAN"],
exactValue: identifierValue
},
thirdPartyEntitySearch: {
thirdPartySearch: {
wsCountrySearch: {
exactCode: countryCode,
},
anyThirdPartyType: thirdPartyType ? [thirdPartyType, "WAITING_VALIDATION"] : undefined
}
}
}
const result = await postThirdPartyEntityIdentifierSearch({
client: this.authService.client,
query: {
first: 0, length: 10,
},
body: search,
});
const identifierList = result.data?.itemList ?? [];
const entityRefs = identifierList.map(i => i.thirdPartyEntityWsRef);
return await this.createModelsFromEntityRefList(entityRefs);
}
private async findThirdPartyByVat(identifierValue: string, countryCode: string | undefined, thirdPartyType: WsThirdPartyType | undefined) {
const search: WsThirdPartyEntitySearch = {
thirdPartySearch: {
exactVat: identifierValue,
wsCountrySearch: {
exactCode: countryCode,
},
anyThirdPartyType: thirdPartyType ? [thirdPartyType, "WAITING_VALIDATION"] : undefined
}
}
const result = await postThirdPartyEntitySearch({
client: this.authService.client,
query: {
first: 0, length: 10,
},
body: search,
});
const entityList = result.data?.itemList ?? [];
return await this.createModelsFromEntityList(entityList);
}
private async findThirdPartyByEnterpriseNumber(identifierValue: string, countryCode: string | undefined, thirdPartyType: WsThirdPartyType | undefined) {
if (countryCode == null) {
throw new Error(`No country code provided, but it is required for national enterprise numbers`);
}
const search: WsThirdPartyEntitySearch = {
thirdPartySearch: {
exactEnterpriseNumber: identifierValue,
wsCountrySearch: {
exactCode: countryCode,
},
anyThirdPartyType: thirdPartyType ? [thirdPartyType, "WAITING_VALIDATION"] : undefined
}
}
const result = await postThirdPartyEntitySearch({
client: this.authService.client,
query: {
first: 0, length: 10,
},
body: search,
});
const entityList = result.data?.itemList ?? [];
return await this.createModelsFromEntityList(entityList);
}
private getThirdPartyType(thirdPartyType: "company" | "person" | "official" | undefined): WsThirdPartyType | undefined {
if (thirdPartyType === "company") {
return "LEGAL_ENTITY";
} else if (thirdPartyType === "person") {
return "PERSON_ENTITY";
} else if (thirdPartyType === "official") {
return "OFFICIAL_ENTITY";
}
return undefined;
}
private createModelsFromEntityRefList(refList: WsRefWsThirdPartyEntity[]) {
return Promise.all(
refList.map(ref => this.createModelForEntityRef(ref))
)
}
private createModelsFromEntityList(entityList: WsThirdPartyEntity[]) {
return Promise.all(
entityList.map(e => this.createModelForEntity(e))
)
}
private async createModelForEntityRef(ref: WsRefWsThirdPartyEntity): Promise<ThirdpartyModel> {
const entityResponse = await getThirdPartyEntityById({
client: this.authService.client,
path: {
id: ref.id,
},
});
const entity = entityResponse.data!;
return await this.createModelForEntity(entity);
}
private async createModelForEntity(entity: WsThirdPartyEntity) {
const thirdPartyResponse = await getThirdPartyById({
client: this.authService.client,
path: {
id: entity.thirdPartyWsRef.id,
},
});
const thirdParty = thirdPartyResponse.data!;
const countryResponse = await getCountryById({
client: this.authService.client,
path: {
id: thirdParty.countryWsRef!.id,
}
});
const country = countryResponse.data!;
return {
id: entity.id,
thirdPartyType: thirdParty.thirdPartyType,
address: thirdParty.address,
city: thirdParty.city,
companyType: thirdParty.companyType,
countryCode: country.code,
enterpriseNumber: thirdParty.enterpriseNumber,
fullName: entity.fullName,
officialName: thirdParty.officialName,
vatLiability: thirdParty.vatLiability,
vatNumber: thirdParty.vatNumber,
zip: thirdParty.zip,
}
}
}
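
The email, phone, and IBAN lookups above build the same search body and differ only in the identifier type. A minimal sketch of a shared builder, assuming the body shape shown in those methods. `buildIdentifierSearch`, `IdentifierKind`, `ThirdPartyKind`, and `IdentifierSearchBody` are hypothetical local types standing in for the generated client types, which are not reproduced here.

```typescript
type IdentifierKind = 'EMAIL' | 'PHONE_NUMBER' | 'IBAN';
type ThirdPartyKind = 'LEGAL_ENTITY' | 'PERSON_ENTITY' | 'OFFICIAL_ENTITY';

// Local stand-in for the generated identifier search type (assumption).
interface IdentifierSearchBody {
  thirdPartyIdentifierSearch: { anyType: IdentifierKind[]; exactValue: string };
  thirdPartyEntitySearch: {
    thirdPartySearch: {
      wsCountrySearch: { exactCode: string | undefined };
      anyThirdPartyType: string[] | undefined;
    };
  };
}

// Builds the search body for any identifier type; only `kind` differs between the lookups.
function buildIdentifierSearch(
  kind: IdentifierKind,
  value: string,
  countryCode?: string,
  thirdPartyType?: ThirdPartyKind,
): IdentifierSearchBody {
  return {
    thirdPartyIdentifierSearch: { anyType: [kind], exactValue: value },
    thirdPartyEntitySearch: {
      thirdPartySearch: {
        wsCountrySearch: { exactCode: countryCode },
        // Mirrors the methods above: entries awaiting validation are included alongside the requested type.
        anyThirdPartyType: thirdPartyType ? [thirdPartyType, 'WAITING_VALIDATION'] : undefined,
      },
    },
  };
}

const ibanSearch = buildIdentifierSearch('IBAN', 'BE68539007547034', 'BE', 'LEGAL_ENTITY');
const emailSearch = buildIdentifierSearch('EMAIL', 'billing@example.com');
console.log(JSON.stringify(ibanSearch));
```

With such a builder, each of the three private methods would reduce to one call followed by the shared search-and-map step.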

@ -0,0 +1,165 @@
/**
* Service for orchestrating the field identification process
*/
import {FieldIdentificationRequestInput, FieldIdentificationResults, NitroClientConfig, ProjectInfo} from '../types';
import {GeminiService, ProjectService} from 'shared-functions';
import {FieldIdentificationService} from './field-identification-service';
import {GeminiNitroService} from './gemini-nitro-service';
import {
API_AUTH_URL,
API_BASE_URL,
API_CLIENT_ID,
API_CLIENT_SECRET,
DRY_RUN_SKIP_GEMINI,
GEMINI_MODEL,
GOOGLE_CLOUD_LOCATION,
GOOGLE_CLOUD_PROJECT_ID,
STORAGE_BUCKET_NAME,
validateConfig
} from '../config';
import {GoogleCloudStorageConfig} from "./nitro-documents-service";
import * as fs from 'fs';
import * as path from 'path';
import {NitroAuthService} from "./nitro-auth-service";
export class ProcessorService {
private fieldIdentificationService: FieldIdentificationService;
private geminiNitroService: GeminiNitroService;
private nitroClientConfig: NitroClientConfig;
private geminiService: GeminiService;
private projectService: ProjectService;
constructor() {
// Validate configuration
validateConfig();
this.nitroClientConfig = {
apiUrl: API_BASE_URL,
authUrl: API_AUTH_URL,
clientId: API_CLIENT_ID,
clientSecret: API_CLIENT_SECRET
};
// Initialize services
this.geminiService = new GeminiService(
GOOGLE_CLOUD_PROJECT_ID,
GOOGLE_CLOUD_LOCATION,
GEMINI_MODEL,
DRY_RUN_SKIP_GEMINI
);
const nitroAuthService = new NitroAuthService(this.nitroClientConfig);
const storageConfig: GoogleCloudStorageConfig = {
bucketName: STORAGE_BUCKET_NAME
}
this.geminiNitroService = new GeminiNitroService(
GOOGLE_CLOUD_PROJECT_ID,
GOOGLE_CLOUD_LOCATION,
GEMINI_MODEL,
DRY_RUN_SKIP_GEMINI,
nitroAuthService,
storageConfig
);
this.fieldIdentificationService = new FieldIdentificationService(this.geminiNitroService, nitroAuthService, storageConfig);
this.projectService = new ProjectService();
}
/**
* Process a field identification request
* @param request Field identification request
* @returns Field identification result
*/
async processFieldIdentification(request: FieldIdentificationRequestInput): Promise<FieldIdentificationResults> {
console.log(`Processing field identification request for project: ${request.projectId}`);
// Validate request
this.validateRequest(request);
// Fetch project info
const projectInfo = await this.fetchProjectInfo(request.projectId);
try {
// Always pass the radix so the ID is parsed as a decimal number
const docId = request.documentId ? parseInt(request.documentId, 10) : undefined;
// Process field identification
const result = await this.fieldIdentificationService.indexDocument(
request.projectId,
docId
);
return result;
} catch (error) {
console.error(`Error processing field identification for project ${request.projectId}:`, error);
throw error;
}
}
/**
* Validate the field identification request
* @param request Field identification request
* @throws Error if the request is invalid
*/
private validateRequest(request: FieldIdentificationRequestInput): void {
if (!request.projectId) {
throw new Error('Project is required');
}
}
/**
* Fetch project information from the project directory
* @param projectId Project ID
* @returns Project information
* @throws Error if the project directory or INFO.md file doesn't exist
*/
private async fetchProjectInfo(projectId: string): Promise<ProjectInfo> {
console.log(`Fetching project info for: ${projectId}`);
// Get the main repository path using ProjectService
const mainRepoPath = await this.projectService.getMainRepositoryPath();
// Construct the path to the project directory and INFO.md file
const promptsDir = path.join(mainRepoPath, 'src', 'prompts');
const functionDir = path.join(promptsDir, 'field-request-to-field-value');
const projectDir = path.join(functionDir, projectId);
const infoFilePath = path.join(projectDir, 'INFO.md');
// Check if the project directory exists
if (!fs.existsSync(projectDir)) {
throw new Error(`Project directory not found: ${projectDir}`);
}
// Check if the INFO.md file exists
if (!fs.existsSync(infoFilePath)) {
throw new Error(`INFO.md file not found for project ${projectId}`);
}
// Read the INFO.md file
let infoContent: string;
try {
infoContent = fs.readFileSync(infoFilePath, 'utf-8');
} catch (error) {
throw new Error(`Failed to read INFO.md for project ${projectId}: ${error instanceof Error ? error.message : String(error)}`);
}
// Extract the project name from the first line (# Project Name)
const nameMatch = infoContent.match(/^#\s+(.+)$/m);
const projectName = nameMatch ? nameMatch[1].trim() : projectId;
// Extract the trustee filter and customer filter.
// [ \t]* (rather than \s*) keeps each match on one line, so an empty filter
// does not capture the following list item as its value.
const trusteeFilterMatch = infoContent.match(/- \[([ x])\] Trustee filter:[ \t]*(.*)/);
const customerFilterMatch = infoContent.match(/- \[([ x])\] Customer filter:[ \t]*(.*)/);
// Create the ProjectInfo object; empty filter values are treated as undefined
const projectInfo: ProjectInfo = {
name: projectName,
path: projectDir,
trusteeFilter: trusteeFilterMatch?.[2].trim() || undefined,
customerFilter: customerFilterMatch?.[2].trim() || undefined
};
return projectInfo;
}
}

View File

@ -0,0 +1,69 @@
/**
* Type definitions for the field-request-to-field-value function
*/
import {Project} from "shared-functions";
import {GeminiResponse} from "shared-functions/dist/services/gemini-file-system-service";
/**
* Project information with additional fields for field-request-to-field-value
*/
export interface ProjectInfo extends Project {
trusteeFilter?: string;
customerFilter?: string;
}
/**
* Field information from markdown files
*/
export interface FieldInfo {
name: string;
prompt: string;
functionIds: string[];
isActive: boolean;
}
/**
* Project fields information
*/
export interface ProjectFields {
projectId: string;
fields: FieldInfo[];
}
/**
* Field identification request parameters
*/
export interface FieldIdentificationRequestInput {
projectId: string;
documentId?: string;
}
/**
* Field identification result
*/
export interface FieldIdentificationResults {
project: string;
indexedFields?: number
skippedFields?: number
problematicFields?: number
documentId?: number;
}
/**
* HTTP response format for the API
*/
export interface HttpResponse {
indexedFields?: number
skippedFields?: number
problematicFields?: number
documentId?: number;
}
export interface NitroClientConfig {
authUrl: string;
apiUrl: string;
clientId: string;
clientSecret: string;
}

View File

@ -0,0 +1,18 @@
{
"compilerOptions": {
"target": "ES2020",
"module": "CommonJS",
"outDir": "dist",
"strict": true,
"esModuleInterop": true,
"forceConsistentCasingInFileNames": true,
"skipLibCheck": true
},
"include": [
"src/**/*"
],
"exclude": [
"node_modules",
"dist"
]
}

View File

@ -648,14 +648,20 @@ Once you have completed all steps, call reportStepOutcome with outcome 'end'`,
const pendingFunctionCalls = [];
let endReceived = false;
// TODO: drop old content above 1M tokens
const updatedRequestContents = [
...request.contents,
];
// Process the streaming response
for await (const item of streamGenerateContentResult.stream) {
const inputTokens = item.usageMetadata?.promptTokenCount ?? 0;
const outputTokens = item.usageMetadata?.candidatesTokenCount ?? 0;
const totalTokens = item.usageMetadata?.totalTokenCount ?? 0;
geminiResponse.inputCost = (geminiResponse.inputCost ?? 0) + inputTokens;
geminiResponse.outputCost = (geminiResponse.outputCost ?? 0) + outputTokens;
geminiResponse.totalCost = (geminiResponse.totalCost ?? 0) + totalTokens;
geminiResponse.inputCost = (inputTokens ?? 0);
geminiResponse.outputCost = (outputTokens ?? 0);
geminiResponse.totalCost = (totalTokens ?? 0);
// Iterate over every part in the response
@ -667,7 +673,9 @@ Once you have completed all steps, call reportStepOutcome with outcome 'end'`,
console.warn(`Multiple (${generateContentCandidates.length}) candidates found in streaming response. Using the first one`);
}
const responseCandidate = generateContentCandidates[0];
const responseParts = responseCandidate.content?.parts || [];
const responseContent = responseCandidate.content;
const responseParts = responseContent.parts || [];
updatedRequestContents.push(responseContent);
if (responseParts.length === 0) {
console.warn(`No parts found in streaming response`);
@ -687,10 +695,6 @@ Once you have completed all steps, call reportStepOutcome with outcome 'end'`,
}
}
// TODO: drop old content above 1M tokens
const updatedRequestContents = [
...request.contents,
];
// Process any function calls that were detected
if (pendingFunctionCalls.length > 0) {

View File

@ -5,9 +5,39 @@
*/
import * as fs from 'fs';
import * as path from 'path';
import {Project} from '../types';
import * as process from 'process';
import {Project, RepoCredentials} from '../types';
import {RepositoryService} from './repository-service';
export class ProjectService {
private repositoryService: RepositoryService;
private mainRepoUrl: string;
private mainRepoCredentials?: RepoCredentials;
constructor(
repositoryService?: RepositoryService,
mainRepoUrl?: string,
mainRepoCredentials?: RepoCredentials
) {
this.repositoryService = repositoryService || new RepositoryService();
this.mainRepoUrl = mainRepoUrl || process.env.MAIN_REPO_URL || 'https://github.com/Ebitda-SRL/test-ai-code-agents.git';
// Set up credentials if provided
if (process.env.MAIN_REPO_TOKEN) {
this.mainRepoCredentials = {
type: 'token',
token: process.env.MAIN_REPO_TOKEN
};
} else if (process.env.MAIN_REPO_USERNAME && process.env.MAIN_REPO_PASSWORD) {
this.mainRepoCredentials = {
type: 'username-password',
username: process.env.MAIN_REPO_USERNAME,
password: process.env.MAIN_REPO_PASSWORD
};
} else if (mainRepoCredentials) {
this.mainRepoCredentials = mainRepoCredentials;
}
}
/**
* Find all projects in the prompts directory
* @param promptsDir Path to the prompts directory
@ -163,4 +193,33 @@ export class ProjectService {
throw new Error(`Failed to fetch remote data from ${uri}`);
}
}
/**
* Get the path to the main repository
* This will either use a local repository or clone the main repository
* based on the USE_LOCAL_REPO environment variable
* @returns Path to the main repository
*/
async getMainRepositoryPath(): Promise<string> {
let mainRepoPath: string;
const useLocalRepo = process.env.USE_LOCAL_REPO === 'true';
// Use local repository or clone the main repository
if (useLocalRepo) {
console.log('Using local repository path');
// When running with functions-framework, we need to navigate up to the project root
// Check if we're in the prompts-to-test-spec directory and navigate up if needed
const currentDir = process.cwd();
mainRepoPath = path.resolve(currentDir, '../../..');
console.log(`Resolved local repository path: ${mainRepoPath}`);
} else {
console.log(`Cloning main repository: ${this.mainRepoUrl}`);
mainRepoPath = await this.repositoryService.cloneMainRepository(
this.mainRepoUrl,
this.mainRepoCredentials
);
}
return mainRepoPath;
}
}

View File

@ -0,0 +1,40 @@
This file describes the AI guidelines for operations in this directory.
## Directory structure
- <project>/: A single project/environment repository
- INFO.md: Project information, including where the API is located
- AI.md: AI guidelines for field indexation specific to the project
- fields/: A directory containing fields-specific prompts
- <field-name>.md: A prompt file for a specific field
### File format
File format is markdown.
It contains checkboxes that must be checked only if the information is available and provided.
#### Project info file format
A project info file follows the following format:
```markdown
# <Project name>
- [ ] Trustee filter: <a json containing a filter for trustees>
- [ ] Customer filter: <a json containing a filter for customers>
```
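As a rough illustration of how such a file could be read, here is a minimal sketch. The names `ParsedProjectInfo`, `readCheckedValue` and `parseProjectInfo` are hypothetical and not part of this repository, and the sketch assumes that an unchecked box means the value should be ignored.

```typescript
// Hypothetical result shape, introduced only for this sketch.
interface ParsedProjectInfo {
  name?: string;
  trusteeFilter?: string;
  customerFilter?: string;
}

// Reads the value of a "- [x] <label>: <value>" line.
// Returns undefined when the box is unchecked or the value is empty.
function readCheckedValue(content: string, label: string): string | undefined {
  const escaped = label.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // [ \t]* keeps the match on a single line, so an empty value
  // never swallows the next list item.
  const pattern = new RegExp(`^- \\[([ x])\\] ${escaped}:[ \\t]*(.*)$`, 'm');
  const match = content.match(pattern);
  if (!match || match[1] !== 'x') {
    return undefined;
  }
  const value = match[2].trim();
  return value.length > 0 ? value : undefined;
}

function parseProjectInfo(content: string): ParsedProjectInfo {
  // Accept any markdown heading level for the project name.
  const heading = content.match(/^#{1,6}\s+(.+)$/m);
  return {
    name: heading ? heading[1].trim() : undefined,
    trusteeFilter: readCheckedValue(content, 'Trustee filter'),
    customerFilter: readCheckedValue(content, 'Customer filter'),
  };
}
```

Matching line by line matters here: a pattern that lets whitespace span a newline would read the next checkbox line as the value of an empty filter.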
#### Fields prompt file format
A Field file follows the following format:
```markdown
## Field name
<Guidelines for indexing the field>
- [ ] Function: <name of a function to provide to ai>
- [ ] Active
```
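A field file in this format could be turned into a `FieldInfo`-like object roughly as follows. This is a sketch under assumptions: `parseFieldFile` is a hypothetical helper, the interface is redeclared locally to keep the block self-contained, and only checked `Function` lines are treated as enabled.

```typescript
// Same shape as the FieldInfo type in this commit, redeclared for the sketch.
interface FieldInfo {
  name: string;
  prompt: string;
  functionIds: string[];
  isActive: boolean;
}

function parseFieldFile(content: string): FieldInfo {
  let name = '';
  const promptLines: string[] = [];
  const functionIds: string[] = [];
  let isActive = false;

  for (const line of content.split(/\r?\n/)) {
    const heading = line.match(/^#{1,6}\s+(.+)$/);
    const fn = line.match(/^- \[([ x])\] Function:\s*(\S+)\s*$/);
    const active = line.match(/^- \[([ x])\] Active\s*$/);
    if (heading && name === '') {
      // The first heading is the field name.
      name = heading[1].trim();
    } else if (fn) {
      // Only checked functions are offered to the model.
      if (fn[1] === 'x') {
        functionIds.push(fn[2]);
      }
    } else if (active) {
      isActive = active[1] === 'x';
    } else {
      // Everything else is part of the indexing guidelines.
      promptLines.push(line);
    }
  }
  return { name, prompt: promptLines.join('\n').trim(), functionIds, isActive };
}
```

Ordinary bullet lines inside the guidelines (for example `- the breakdown should ...`) do not match either checkbox pattern, so they stay in the prompt text.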

View File

@ -0,0 +1,4 @@
# Nitro-dev
- [ ] Trustee filter:
- [ ] Customer filter:

View File

@ -0,0 +1,16 @@
## COMMENT
You must identify any unexpected or exotic properties of the document content.
For instance, illegibility, inaccuracies, fraud, or lacking information.
But also, hand-written statements, mixing of multiple documents in a single file, etc.
If the document can be considered normal, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing a description if applicable.
Otherwise, describe the particularities of the document in a short paragraph,
and use the setStringIdentification function to complete the task.
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,15 @@
## DOCUMENT_TYPE
You must identify the currency of the amounts mentioned in the document.
If multiple currencies are present, use the one for the total amount.
Use the 3-letter ISO 4217 code when possible, and call the setStringIdentification function
with stringType=currency to complete the task.
If the currency is not identifiable, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing a description.
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,33 @@
## DETAILS_BREAKDOWN
You must identify the breakdown of the document amounts according to a list
of identifiers.
For instance, some documents contain amounts targeting different vehicles, and
the individual amounts for each vehicle must be identified.
Use the listDetails function to list the identifiers from the backend.
For each identifier, construct a json array containing objects with the following structure:
{
"details": "<the identifier>",
"taxExclusiveAmount": "1000.00",
"taxInclusiveAmount": "1000.00"
}
- the taxInclusiveAmount may be omitted.
- the identifier should come from the list of identifiers returned by the backend.
But if the semantics of the identifiers are clear, and another one is clearly identifiable,
then an entry for this identifier may be added.
- the breakdown should sum up to the total amount for the document.
If the document contains the information to compute the breakdown,
use the setDetailsBreakDownIdentification function to complete the task.
Otherwise, use the setFieldProblematic function to complete the task,
using a problem type "ABSENT", and a description indicating why it is absent.
- [x] Function: listDetails
- [x] Function: setDetailsBreakDownIdentification
- [x] Function: setFieldProblematic
- [x] Active
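
The rule that the breakdown should sum up to the document total can be checked mechanically. The sketch below is illustrative only: `DetailsBreakdownEntry`, `toCents` and `breakdownMatchesTotal` are hypothetical names, and it assumes the comparison is made on tax-exclusive amounts with at most two decimals.

```typescript
// Hypothetical entry shape following the JSON structure described above.
interface DetailsBreakdownEntry {
  details: string;
  taxExclusiveAmount: string;
  taxInclusiveAmount?: string;
}

// Converts a decimal string such as "1000.00" to an integer number of cents,
// which avoids floating-point rounding when summing.
function toCents(amount: string): number {
  const match = amount.trim().match(/^(-?)(\d+)(?:\.(\d{1,2}))?$/);
  if (!match) {
    throw new Error(`Unsupported amount format: ${amount}`);
  }
  const fraction = ((match[3] || '') + '00').slice(0, 2);
  const cents = Number(match[2]) * 100 + Number(fraction);
  return match[1] === '-' ? -cents : cents;
}

// True when the tax-exclusive amounts add up exactly to the given total.
function breakdownMatchesTotal(entries: DetailsBreakdownEntry[], totalTaxExclusive: string): boolean {
  const sum = entries.reduce((acc, entry) => acc + toCents(entry.taxExclusiveAmount), 0);
  return sum === toCents(totalTaxExclusive);
}
```

A check like this could run before the final function call, so that an inconsistent breakdown is reported as problematic instead of being stored.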

View File

@ -0,0 +1,13 @@
## DOCUMENT_TYPE
You must identify the date of the document.
Format the date in iso format (YYYY-MM-DD) and use the setStringIdentification function
with stringType=date to complete the task.
If the date is not identifiable, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing a description.
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,13 @@
## DOCUMENT_TYPE
You must identify the due date mentioned on the document.
Format the date in iso format (YYYY-MM-DD) and use the setStringIdentification function
with stringType=date to complete the task.
If the date is not identifiable, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing a description.
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,12 @@
## DOCUMENT_NUMBER
You must identify the document number of the document.
Use the setStringIdentification function to complete the task.
If the document number is not identifiable, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing a description.
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,15 @@
## DOCUMENT_TYPE
You must identify the type of the document.
Use the listDocumentTypes function to get the list of available document types.
Use the setStringIdentification function with stringType=documentType to complete the task, passing the document type value.
If the type of the document is not listed, use the setFieldProblematic function to complete the task,
not mentioning any problem type, and providing the missing document type as description.
- [x] Function: listDocumentTypes
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,15 @@
## FISCAL_YEAR
You must identify any fiscal year that this document would be related to.
Some documents are related to a specific fiscal year in the past or the future.
If this information is provided and unambiguous,
use the setStringIdentification function with stringType=year to complete the task.
Otherwise, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing a description if applicable.
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,12 @@
## INVOICE_NUMBER
You must identify the invoice number of the document. Credit note numbers are also accepted.
Use the setStringIdentification function to complete the task.
If the invoice number is not identifiable, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing a description.
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,21 @@
## PAYER_ENTITY
You must identify whether the payer entity is the emitter or recipient company of this document,
or whether it is another third party.
If the information is unambiguous, and the payer is the emitter or recipient company,
then use the setStringIdentification function with stringType=payerType to complete the task
and provide the value "ENTERPRISE".
If the information is unambiguous, and the payer is neither the emitter nor the recipient company,
then use the setStringIdentification function with stringType=payerType to complete the task
and provide the value "OTHER".
If the information is absent, ambiguous, or the payer is not a company, use the
setFieldProblematic function to complete the task, using "ABSENT" as problem type.
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,15 @@
## PAYMENT_IBAN
You must identify the IBAN number of the bank account that is expected to receive the payment.
Look for all payment instructions and check if a single or main IBAN is mentioned.
If an IBAN is identified, use the setStringIdentification function to complete the task.
Provide the IBAN without any spaces or formatting ([A-Z0-9]+).
Otherwise, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing a description if applicable.
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active
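
The expected output format, an IBAN without spaces or formatting, can be sketched as a small normalisation step. `normalizeIban` is a hypothetical helper, not a function from this repository, and it deliberately does not verify the IBAN check digits.

```typescript
// Strips whitespace, dots and dashes, upper-cases the result, and returns it
// only if it matches the [A-Z0-9]+ shape required by the prompt.
function normalizeIban(raw: string): string | undefined {
  const compact = raw.replace(/[\s.\-]/g, '').toUpperCase();
  return /^[A-Z0-9]+$/.test(compact) ? compact : undefined;
}
```

For example, `normalizeIban('be68 5390 0754 7034')` yields `BE68539007547034`, while input containing other characters is rejected.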

View File

@ -0,0 +1,16 @@
## PAYMENT_MODE
You must identify a unique mode of payment mentioned on the document.
Look for all payment instructions and check if a single payment mode is expected.
Call the listPaymentModes function to retrieve a list of payment modes supported.
If the payment mode is identified and listed, use the setStringIdentification function with stringType=paymentMode to complete the task.
Otherwise, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing the identified payment mode in description if applicable.
- [x] Function: listPaymentModes
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,18 @@
## PAYMENT_STATUS
You must identify a payment status.
Look for any information indicating whether the payment is already paid, entirely or partially.
If the payment is entirely paid, use the setStringIdentification function to complete the task
and provide the value "PAID" and stringType "paymentStatus".
If the payment is partially paid, use the setStringIdentification function to complete the task
and provide the value "PARTIALLY_PAID" and stringType "paymentStatus".
Otherwise, use the setFieldProblematic function to complete the task
and provide the problem type "ABSENT" and stringType "paymentStatus".
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,13 @@
## STRUCTURED_PAYMENT_REFERENCE
You must identify a structured payment reference present in the document
in the format "+++000/0000/00000+++", or less frequently just 12 digits.
Extract the 12 digits, and call the setStringIdentification function with stringType=structuredReference to complete the task.
If the structured payment reference is not identifiable, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing a description.
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active
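
Extracting the 12 digits from either variant can be sketched as follows. `extractStructuredReference` is a hypothetical helper used for illustration, and it does not validate any check digits.

```typescript
// Returns the 12 digits of a structured payment reference, or undefined.
function extractStructuredReference(text: string): string | undefined {
  // Formatted variant: +++000/0000/00000+++
  const formatted = text.match(/\+\+\+(\d{3})\/(\d{4})\/(\d{5})\+\+\+/);
  if (formatted) {
    return formatted[1] + formatted[2] + formatted[3];
  }
  // Less frequent variant: exactly 12 consecutive digits,
  // not embedded in a longer run of digits.
  const plain = text.match(/(?:^|\D)(\d{12})(?!\d)/);
  return plain ? plain[1] : undefined;
}
```

The formatted variant is tried first because a bare 12-digit match is more likely to be a false positive, such as an account or phone number.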

View File

@ -0,0 +1,28 @@
## TAX_BREAKDOWN
You must identify the tax breakdown by tax rate.
For each tax rate, the sum of the amounts for which this rate applies must be computed.
Construct a JSON array containing objects with the following structure:
{
"taxRate": "0.21",
"baseAmount": "1000.00",
"taxAmount": "210.00",
"totalAmount": "1210.00"
}
- the tax rate must be between 0 and 1 inclusive
- the amounts may be positive or negative
- If the total amount is the sum of the base amount and the tax amount,
it must be omitted.
If the document contains the information to compute the tax breakdown,
use the setTaxBreakDownIdentification function to complete the task.
Otherwise, use the setFieldProblematic function to complete the task,
using a problem type "ABSENT", and a description indicating why it is absent.
- [x] Function: setTaxBreakDownIdentification
- [x] Function: setFieldProblematic
- [x] Active
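
The constraints on a tax breakdown entry can be expressed as a small normalisation function. This is a sketch under assumptions: `TaxBreakdownEntry`, `parseCents` and `normalizeTaxEntry` are hypothetical names, and amounts are assumed to carry at most two decimals.

```typescript
// Hypothetical entry shape following the JSON structure described above.
interface TaxBreakdownEntry {
  taxRate: string;
  baseAmount: string;
  taxAmount: string;
  totalAmount?: string;
}

// Converts a decimal string such as "210.00" to integer cents.
function parseCents(amount: string): number {
  const match = amount.trim().match(/^(-?)(\d+)(?:\.(\d{1,2}))?$/);
  if (!match) {
    throw new Error(`Unsupported amount format: ${amount}`);
  }
  const cents = Number(match[2]) * 100 + Number(((match[3] || '') + '00').slice(0, 2));
  return match[1] === '-' ? -cents : cents;
}

// Validates the rate and drops totalAmount when it is redundant,
// that is, when it equals baseAmount + taxAmount.
function normalizeTaxEntry(entry: TaxBreakdownEntry): TaxBreakdownEntry {
  const rate = Number(entry.taxRate);
  if (entry.taxRate.trim() === '' || isNaN(rate) || rate < 0 || rate > 1) {
    throw new Error(`taxRate must be between 0 and 1 inclusive: ${entry.taxRate}`);
  }
  const base = parseCents(entry.baseAmount);
  const tax = parseCents(entry.taxAmount);
  const normalized: TaxBreakdownEntry = {
    taxRate: entry.taxRate,
    baseAmount: entry.baseAmount,
    taxAmount: entry.taxAmount,
  };
  if (entry.totalAmount !== undefined && parseCents(entry.totalAmount) !== base + tax) {
    normalized.totalAmount = entry.totalAmount;
  }
  return normalized;
}
```

With a 21% rate on a base of 1000.00, the tax amount is 210.00 and the total 1210.00, so the total is dropped as redundant; a total that deviates from that sum is kept.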

View File

@ -0,0 +1,12 @@
## TAX_TOTAL_AMOUNT
You must identify the total tax amount mentioned on the document.
Use the setCurrencyAmount function to complete the task.
If the total tax amount is not identifiable, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing a description.
- [x] Function: setCurrencyAmount
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,30 @@
## THIRDPARTY_FROM
You must identify the company or person (thirdparty) that is the EMITTER of this document
among values present on a backend.
- Identify whether the thirdparty is a company or a person.
- Identify the country of origin.
- Identify a unique identifier like VAT, enterprise number, email, IBAN, phone number, etc.
- If one or more unique identifiers are found, call the findThirdPartyByIdentifier function
to search for existing third parties on the backend using that information.
- If the thirdparty is found, use the setThirdPartyIdentification function to complete the task.
- If the thirdparty is not found, use the setFieldProblematic function to complete the task,
mentioning a problem type "THIRDPARTY_DOES_NOT_EXISTS" and all the identified values as description.
- Otherwise, use the findThirdPartyByName function to search for existing third parties on the backend
using non unique identifiers
- If the thirdparty is found, use the setThirdPartyIdentification function to complete the task.
- If the thirdparty is not found, use the setFieldProblematic function to complete the task,
mentioning a problem type "THIRDPARTY_DOES_NOT_EXISTS" and all the identified values as description.
- Otherwise, use the setFieldProblematic function to complete the task,
mentioning a problem type "THIRDPARTY_NOT_IDENTIFIABLE" and all the identified values as description.
- [x] Function: findThirdPartyByIdentifier
- [x] Function: findThirdPartyByName
- [x] Function: setFieldProblematic
- [x] Function: setThirdPartyIdentification
- [x] Active

View File

@ -0,0 +1,30 @@
## THIRDPARTY_TO
You must identify the company or person (thirdparty) that is the RECIPIENT of this document
among values present on a backend.
- Identify whether the thirdparty is a company or a person.
- Identify the country of origin.
- Identify a unique identifier like VAT, enterprise number, email, IBAN, phone number, etc.
- If one or more unique identifiers are found, call the findThirdPartyByIdentifier function
to search for existing third parties on the backend using that information.
- If the thirdparty is found, use the setThirdPartyIdentification function to complete the task.
- If the thirdparty is not found, use the setFieldProblematic function to complete the task,
mentioning a problem type "THIRDPARTY_DOES_NOT_EXISTS" and all the identified values as description.
- Otherwise, use the findThirdPartyByName function to search for existing third parties on the backend
using non unique identifiers
- If the thirdparty is found, use the setThirdPartyIdentification function to complete the task.
- If the thirdparty is not found, use the setFieldProblematic function to complete the task,
mentioning a problem type "THIRDPARTY_DOES_NOT_EXISTS" and all the identified values as description.
- Otherwise, use the setFieldProblematic function to complete the task,
mentioning a problem type "THIRDPARTY_NOT_IDENTIFIABLE" and all the identified values as description.
- [x] Function: findThirdPartyByIdentifier
- [x] Function: findThirdPartyByName
- [x] Function: setFieldProblematic
- [x] Function: setThirdPartyIdentification
- [x] Active

View File

@ -0,0 +1,12 @@
## TOTAL_AMOUNT
You must identify the total amount mentioned on the document, tax inclusive.
Use the setCurrencyAmount function to complete the task.
If the total amount is not identifiable, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing a description.
- [x] Function: setCurrencyAmount
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,12 @@
## TOTAL_AMOUNT_TAX_EXCLUSIVE
You must identify the total amount mentioned on the document, tax exclusive.
Use the setCurrencyAmount function to complete the task.
If the total tax exclusive amount is not identifiable, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing a description.
- [x] Function: setCurrencyAmount
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1,15 @@
## STRUCTURED_PAYMENT_REFERENCE
You must identify a non-structured payment reference present in the document, thus
NOT in the format "+++000/0000/00000+++".
Look for any other payment instructions and check the requested payment communication.
Use the setStringIdentification function to complete the task.
If the payment reference is not identifiable, use the setFieldProblematic function to complete the task,
using "ABSENT" as problem type, and providing a description.
- [x] Function: setStringIdentification
- [x] Function: setFieldProblematic
- [x] Active

View File

@ -0,0 +1 @@
This function creates Nitro field values from field requests.