RFP Ingestion Skill
Overview
This skill helps implement multi-source RFP data ingestion with canonical schema normalization and deduplication.
Supported Data Sources
| Source | Priority | API Type | Rate Limits |
|---|---|---|---|
| SAM.gov | P1 | REST API | 10 req/sec, 10k/day |
| Maryland eMMA | P1 | Web scraping | Respectful crawling |
| RFPMart API | Current | REST API | As documented |
| RFPMart CSV | Current | Manual upload | N/A |
| GovTribe | P2 | REST API (paid) | Per subscription |
CSV Upload (RFPMart Email Alerts)
RFPMart sends periodic email alerts with CSV attachments. These can be manually uploaded through the Admin UI.
| Column | Index | Content | Example |
|---|---|---|---|
| ID | 0 | RFP identifier | SW-82097 |
| Country | 1 | Country code | USA |
| State | 2 | State name | Idaho |
| Title | 3 | Full title with location | SW-82097 - USA (Idaho) - Data Concealment... |
| Deadline | 4 | Due date | March 25,2026 |
| URL | 5 | RFPMart link | https://www.rfpmart.com/... |
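The column layout above can be turned into a typed record once a line has been split into fields. The following is a minimal sketch, not the project's actual parser: the names `RfpMartRow`, `parseDeadline`, and `toRfpMartRow` are hypothetical, and the deadline format (`Month D,YYYY`) is assumed from the single example row in the table.

```typescript
// Hypothetical shape for one parsed RFPMart CSV row (names are illustrative).
interface RfpMartRow {
  id: string;
  country: string;
  state: string;
  title: string;
  deadline: number; // Unix timestamp in milliseconds (UTC midnight)
  url: string;
}

const MONTHS = [
  "january", "february", "march", "april", "may", "june",
  "july", "august", "september", "october", "november", "december",
];

// Parses dates like "March 25,2026" (assumed format, taken from the example row).
// Returns null when the text does not match or names an impossible date.
function parseDeadline(text: string): number | null {
  const match = /^\s*([A-Za-z]+)\s+(\d{1,2})\s*,\s*(\d{4})\s*$/.exec(text);
  if (!match) return null;
  const month = MONTHS.indexOf(match[1].toLowerCase());
  if (month === -1) return null;
  const day = Number(match[2]);
  const year = Number(match[3]);
  const timestamp = Date.UTC(year, month, day);
  // Reject overflowed dates such as "February 31,2026".
  if (new Date(timestamp).getUTCDate() !== day) return null;
  return timestamp;
}

// Maps the six positional columns to named fields; returns null for bad rows.
function toRfpMartRow(fields: string[]): RfpMartRow | null {
  if (fields.length < 6) return null;
  const [id, country, state, title, deadlineText, url] = fields.map((f) => f.trim());
  const deadline = parseDeadline(deadlineText);
  if (!id || deadline === null) return null;
  return { id, country, state, title, deadline, url };
}
```

Returning `null` instead of throwing lets the caller count a malformed row as skipped and continue with the rest of the file.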
ID Prefix → Category Mapping
```typescript
const categoryMap: Record<string, string> = {
  SW: "Software Development",
  ITES: "IT Services",
  NET: "Networking",
  TELCOM: "Telecommunications",
  DRA: "Data & Research",
  CSE: "Security Services",
  HR: "Human Resources",
  PM: "Project Management",
  MRB: "Marketing & Branding",
  // ... other prefixes default to "Other"
};
```
IT-Relevant Prefixes
When filtering for IT-relevant RFPs only, these prefixes are included:
- `SW` - Software Development
- `ITES` - IT Services
- `NET` - Networking
- `TELCOM` - Telecommunications
- `DRA` - Data & Research
- `CSE` - Security Services
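A possible way to apply the prefix map and the IT-relevance filter to an RFP ID is sketched below. It assumes IDs have the form `<PREFIX>-<number>` as in the example `SW-82097`. The helper names (`getPrefix`, `getCategory`, `isItRelevant`) are hypothetical and do not come from the project code.

```typescript
// Prefix map repeated from this document so the sketch runs on its own.
const CATEGORY_BY_PREFIX: Record<string, string> = {
  SW: "Software Development",
  ITES: "IT Services",
  NET: "Networking",
  TELCOM: "Telecommunications",
  DRA: "Data & Research",
  CSE: "Security Services",
  HR: "Human Resources",
  PM: "Project Management",
  MRB: "Marketing & Branding",
};

// The six prefixes listed as IT-relevant.
const IT_RELEVANT_PREFIXES = new Set(["SW", "ITES", "NET", "TELCOM", "DRA", "CSE"]);

// Assumes IDs look like "<PREFIX>-<number>", e.g. "SW-82097".
function getPrefix(rfpId: string): string {
  const dash = rfpId.indexOf("-");
  return (dash === -1 ? rfpId : rfpId.slice(0, dash)).trim().toUpperCase();
}

// Unknown prefixes fall back to "Other".
function getCategory(rfpId: string): string {
  return CATEGORY_BY_PREFIX[getPrefix(rfpId)] ?? "Other";
}

function isItRelevant(rfpId: string): boolean {
  return IT_RELEVANT_PREFIXES.has(getPrefix(rfpId));
}
```

With this shape, the "Only import IT-relevant RFPs" toggle reduces to filtering parsed rows with `isItRelevant(row.id)`.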
Key Files
| File | Purpose |
|---|---|
| `convex/ingestion/rfpmartCsv.ts` | CSV parser and Convex action |
| `components/admin/CsvUpload.tsx` | Drag-and-drop upload UI |
Usage
- Navigate to Admin → Data Sources tab
- Scroll to RFPMart CSV Upload section
- Drop a CSV file or click to browse
- Toggle "Only import IT-relevant RFPs" if desired
- View results summary (new/updated/skipped/errors)
Implementation Example
```typescript
// Splits one CSV line into fields, honoring double-quoted fields.
// Inside quotes, commas are literal and "" is an escaped quote.
function parseCSVLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const char = line[i];
    if (char === '"') {
      if (inQuotes && line[i + 1] === '"') {
        // Escaped quote: emit one quote and skip the second character.
        current += '"';
        i++;
      } else {
        inQuotes = !inQuotes;
      }
    } else if (char === "," && !inQuotes) {
      fields.push(current);
      current = "";
    } else {
      current += char;
    }
  }
  fields.push(current);
  return fields;
}
```
Canonical Schema
All sources must normalize to this schema:
```typescript
interface Opportunity {
  externalId: string; // Source-specific ID
  source: "sam.gov" | "emma" | "rfpmart" | "govtribe";
  title: string;
  description: string;
  summary?: string;
  location: string;
  category: string;
  naicsCode?: string;
  setAside?: string; // "Small Business", "8(a)", etc.
  postedDate: number; // Unix timestamp in milliseconds
  expiryDate: number; // Unix timestamp in milliseconds
  url: string;
  attachments?: Attachment[]; // Attachment type is not shown in this document
  eligibilityFlags?: string[];
  rawData: Record<string, unknown>;
  ingestedAt: number; // Unix timestamp in milliseconds
}
```
SAM.gov Integration
API Endpoint
https://api.sam.gov/opportunities/v2/search
```typescript
// Request headers
const headers = {
  Accept: "application/json",
  "X-Api-Key": process.env.SAM_GOV_API_KEY,
};
```
Example Query
```typescript
const params = new URLSearchParams({
  postedFrom: "01/01/2024", // SAM.gov expects MM/dd/yyyy
  postedTo: "12/31/2024",
  limit: "100",
  offset: "0",
  ptype: "o", // Solicitations only
});
```
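The `limit` and `offset` parameters allow results to be fetched page by page. The loop below is a sketch under stated assumptions rather than a confirmed description of the API: it assumes the response carries `totalRecords` and `opportunitiesData`, and it leaves the meaning of `offset` configurable, because it should be checked against the SAM.gov documentation whether `offset` counts records or pages. `fetchAllPages`, `SamPage`, and the injected `fetchPage` callback are hypothetical names.

```typescript
// Assumed response shape; only the fields used here are listed.
interface SamPage {
  totalRecords?: number;
  opportunitiesData?: unknown[];
}

// fetchPage is injected so the loop can be tested without network access.
async function fetchAllPages(
  fetchPage: (offset: number, limit: number) => Promise<SamPage>,
  limit = 100,
  offsetStep = limit, // pass 1 if the API treats offset as a page index
  maxPages = 50, // hard stop so a misbehaving API cannot loop forever
): Promise<unknown[]> {
  const all: unknown[] = [];
  let offset = 0;
  for (let page = 0; page < maxPages; page++) {
    const data = await fetchPage(offset, limit);
    const records = data.opportunitiesData ?? [];
    all.push(...records);
    const total = data.totalRecords ?? all.length;
    // Stop on a short page or once everything reported has been collected.
    if (records.length < limit || all.length >= total) break;
    offset += offsetStep;
  }
  return all;
}
```

In a real action, `fetchPage` would build the query string with `URLSearchParams` and call `fetch`. Each page counts against the rate limits listed under Supported Data Sources.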
Field Mapping
| SAM.gov Field | Canonical Field |
|---|---|
| `noticeId` | `externalId` |
| `title` | `title` |
| `description` | `description` |
| `postedDate` | `postedDate` (parse to timestamp) |
| `responseDeadLine` | `expiryDate` (parse to timestamp) |
| `placeOfPerformance.state` | `location` |
| `naicsCode` | `naicsCode` |
| `setAsideDescription` | `setAside` |
Convex Implementation
Ingestion Action
```typescript
// convex/ingestion.ts
// internalMutation is used by logIngestion, defined in this file (not shown).
import { internalAction, internalMutation } from "./_generated/server";
import { v } from "convex/values";
import { internal } from "./_generated/api";

// SAM.gov expects dates as MM/dd/yyyy.
function formatSamDate(date: Date): string {
  const mm = String(date.getUTCMonth() + 1).padStart(2, "0");
  const dd = String(date.getUTCDate()).padStart(2, "0");
  return `${mm}/${dd}/${date.getUTCFullYear()}`;
}

// internalAction (rather than action) so the cron job can reference it as
// internal.ingestion.ingestFromSam and it is not exposed to clients.
export const ingestFromSam = internalAction({
  args: { daysBack: v.optional(v.number()) },
  handler: async (ctx, args) => {
    const apiKey = process.env.SAM_GOV_API_KEY;
    if (!apiKey) throw new Error("SAM_GOV_API_KEY not configured");

    const toDate = new Date();
    const fromDate = new Date(toDate);
    fromDate.setUTCDate(fromDate.getUTCDate() - (args.daysBack ?? 7));

    // URLSearchParams encodes the key and the slashes in the dates.
    const params = new URLSearchParams({
      api_key: apiKey,
      postedFrom: formatSamDate(fromDate),
      postedTo: formatSamDate(toDate),
      limit: "100",
    });
    const response = await fetch(
      `https://api.sam.gov/opportunities/v2/search?${params}`,
      { headers: { Accept: "application/json" } }
    );

    if (!response.ok) {
      throw new Error(`SAM.gov API error: ${response.status}`);
    }

    const data = await response.json();
    let ingested = 0;
    let updated = 0;

    for (const opp of data.opportunitiesData ?? []) {
      const result = await ctx.runMutation(internal.rfps.upsert, {
        externalId: opp.noticeId,
        source: "sam.gov",
        title: opp.title ?? "Untitled",
        description: opp.description ?? "",
        location: opp.placeOfPerformance?.state ?? "USA",
        category: opp.naicsCode ?? "Unknown",
        postedDate: new Date(opp.postedDate).getTime(),
        expiryDate: new Date(opp.responseDeadLine).getTime(),
        url: `https://sam.gov/opp/${opp.noticeId}/view`,
        rawData: opp,
      });

      if (result.action === "inserted") ingested++;
      else updated++;
    }

    // Log ingestion
    await ctx.runMutation(internal.ingestion.logIngestion, {
      source: "sam.gov",
      status: "completed",
      recordsProcessed: data.opportunitiesData?.length ?? 0,
      recordsInserted: ingested,
      recordsUpdated: updated,
    });

    return { ingested, updated, source: "sam.gov" };
  },
});
```
Upsert Mutation
```typescript
// convex/rfps.ts (internal mutation)
import { internalMutation } from "./_generated/server";
import { v } from "convex/values";

export const upsert = internalMutation({
  args: {
    externalId: v.string(),
    source: v.string(),
    title: v.string(),
    description: v.string(),
    location: v.string(),
    category: v.string(),
    postedDate: v.number(),
    expiryDate: v.number(),
    url: v.string(),
    rawData: v.optional(v.any()),
  },
  handler: async (ctx, args) => {
    // The "by_external_id" index must cover ["externalId", "source"], in that order.
    const existing = await ctx.db
      .query("rfps")
      .withIndex("by_external_id", (q) =>
        q.eq("externalId", args.externalId).eq("source", args.source)
      )
      .first();

    const now = Date.now();

    if (existing) {
      await ctx.db.patch(existing._id, { ...args, updatedAt: now });
      return { id: existing._id, action: "updated" as const };
    }

    const id = await ctx.db.insert("rfps", {
      ...args,
      ingestedAt: now,
      updatedAt: now,
    });
    return { id, action: "inserted" as const };
  },
});
```
Deduplication Strategy
- Exact match: `externalId` + `source` combination
- Title similarity: Fuzzy match titles within same deadline window
- URL canonicalization: Normalize URLs before comparison
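The second and third strategies are only named in the list above, so the following sketch shows one possible reading of them. Everything in it is an assumption: the normalization rules in `canonicalizeUrl`, the use of Jaccard word overlap for title similarity, the `0.8` threshold, and the 24-hour deadline window are illustrative choices, not the project's confirmed behavior.

```typescript
// Drops the fragment, utm_* tracking parameters, and a trailing slash,
// and sorts the remaining query parameters by key.
function canonicalizeUrl(raw: string): string {
  const url = new URL(raw.trim()); // the URL parser lowercases the host
  url.hash = "";
  const kept: [string, string][] = [];
  url.searchParams.forEach((value, key) => {
    if (!key.toLowerCase().startsWith("utm_")) kept.push([key, value]);
  });
  kept.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  url.search = new URLSearchParams(kept).toString();
  if (url.pathname.length > 1 && url.pathname.endsWith("/")) {
    url.pathname = url.pathname.slice(0, -1);
  }
  return url.toString();
}

function tokenize(title: string): Set<string> {
  return new Set(title.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0));
}

// Jaccard similarity of the word sets: 1 means identical, 0 means disjoint.
function titleSimilarity(a: string, b: string): number {
  const setA = tokenize(a);
  const setB = tokenize(b);
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  setA.forEach((token) => {
    if (setB.has(token)) shared++;
  });
  return shared / (setA.size + setB.size - shared);
}

interface Candidate {
  title: string;
  expiryDate: number; // Unix timestamp in milliseconds
}

// Flags two records as likely duplicates when their deadlines fall within
// the window and their titles are similar enough.
function isLikelyDuplicate(
  a: Candidate,
  b: Candidate,
  threshold = 0.8,
  windowMs = 24 * 60 * 60 * 1000,
): boolean {
  return (
    Math.abs(a.expiryDate - b.expiryDate) <= windowMs &&
    titleSimilarity(a.title, b.title) >= threshold
  );
}
```

Checking the exact `externalId` + `source` match first keeps the fuzzy comparison as a fallback for records that arrive from different sources.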
Eligibility Pre-Filtering
Detect disqualifiers during ingestion:
```typescript
// Each pattern maps a phrase found in the RFP text to an eligibility flag.
// Plural forms ("citizens", "companies", "organizations") are matched too.
const DISQUALIFIER_PATTERNS = [
  { pattern: /u\.?s\.?\s*(citizens?|compan(?:y|ies)|organizations?)\s*only/i, flag: "us-org-only" },
  { pattern: /onshore\s*(only|required)/i, flag: "onshore-required" },
  { pattern: /on-?site\s*(required|mandatory)/i, flag: "onsite-required" },
  { pattern: /security\s*clearance\s*required/i, flag: "clearance-required" },
  { pattern: /small\s*business\s*set[- ]aside/i, flag: "small-business-set-aside" },
];

function detectEligibilityFlags(text: string): string[] {
  return DISQUALIFIER_PATTERNS
    .filter(({ pattern }) => pattern.test(text))
    .map(({ flag }) => flag);
}
```
Scheduled Ingestion
```typescript
// convex/crons.ts
import { cronJobs } from "convex/server";
import { internal } from "./_generated/api";

const crons = cronJobs();

crons.interval(
  "ingest-sam-gov",
  { hours: 6 },
  internal.ingestion.ingestFromSam,
  { daysBack: 3 }
);

export default crons;
```
Error Handling
| Error Type | Action |
|---|---|
| Rate limit (429) | Exponential backoff, retry after delay |
| Auth error (401/403) | Log error, alert admin |
| Server error (5xx) | Retry up to 3 times |
| Parse error | Log raw data, skip record |
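The retry rows of the table could be implemented with a small wrapper like the one below. This is a sketch under assumptions, not the project's code: `withRetry` is a hypothetical helper, the 1-second base delay is an arbitrary choice, and the same backoff schedule is applied to both 429 and 5xx responses for simplicity.

```typescript
// Retries a request on 429 and 5xx with exponential backoff,
// and fails fast on 401/403 so an admin can be alerted.
async function withRetry<T extends { status: number }>(
  request: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 1000,
  // sleep is injectable so tests do not have to wait for real timers.
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const response = await request();
    if (response.status === 401 || response.status === 403) {
      throw new Error(`Auth error ${response.status}: check the API key`);
    }
    const retryable = response.status === 429 || response.status >= 500;
    // Return successes, non-retryable errors, and the final failed attempt.
    if (!retryable || attempt >= maxRetries) return response;
    await sleep(baseDelayMs * 2 ** attempt); // 1s, 2s, 4s with the defaults
  }
}
```

A call site might look like `await withRetry(() => fetch(url, { headers }))`, after which the caller still checks `response.ok`. If the API returns a `Retry-After` header on 429, honoring it would be a reasonable refinement.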
Testing Approach
- Mock API responses for unit tests
- Use sandbox/test endpoints when available
- Validate schema transformation
- Test deduplication logic
- Verify eligibility flag detection