The anatomy of a connector — what's inside the boxes you don't have to write

Pre-built integration sounds like marketing language for "drag and drop." It isn't. Here's what's actually inside one of our connectors and why building it once means you don't.

Network cabling and patch panel close-up

When we tell prospects that Aquifer ships “pre-built connectors,” there’s usually a beat of skepticism. They’ve heard that pitch before. The last vendor who said “pre-built” turned out to mean a thin REST wrapper that broke at the first edge case. So I want to walk through what’s actually inside one of our connectors — not the marketing version, the engineering version.

I’ll use Procore as the running example because it’s the connector we’ve spent the most cycles on, and because it surfaces every category of problem you have to solve.

Auth, paging, rate limits — the table stakes

The boring part. Procore uses OAuth 2.0 with three flows depending on the install type (company app, marketplace app, app-as-a-service). Our connector handles all three, refreshes tokens proactively, and degrades gracefully when a refresh fails (it doesn’t lose the in-flight sync; it pauses cleanly and emits an alert).
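The refresh behavior above can be sketched in a few lines. This is an illustrative model, not Aquifer's actual code: `TokenManager`, `refresh_fn`, and the alert list are all made-up names standing in for the real OAuth client and alerting pipeline.

```python
import time


class TokenExpiredError(Exception):
    """Stand-in for a failed call to the OAuth token endpoint."""


class TokenManager:
    """Proactive token refresh with a clean pause on failure (sketch)."""

    REFRESH_MARGIN = 300  # refresh 5 minutes before expiry, not at expiry

    def __init__(self, refresh_fn, now=time.time):
        self.refresh_fn = refresh_fn  # returns (token, ttl_seconds)
        self.now = now
        self.token = None
        self.expires_at = 0.0
        self.paused = False
        self.alerts = []

    def get_token(self):
        if self.paused:
            raise RuntimeError("sync paused: token refresh failed")
        # Refresh *before* expiry so in-flight requests never race the clock.
        if self.now() >= self.expires_at - self.REFRESH_MARGIN:
            try:
                self.token, ttl = self.refresh_fn()
                self.expires_at = self.now() + ttl
            except TokenExpiredError:
                # Don't drop the in-flight sync: pause cleanly, emit an alert.
                self.paused = True
                self.alerts.append("token refresh failed; sync paused")
                raise RuntimeError("sync paused: token refresh failed")
        return self.token
```

The point of the sketch is the failure path: a failed refresh flips the manager into a paused state and records an alert instead of letting half-synced data through.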

Paging is per-endpoint. Some endpoints page on offset, some on cursor, some on a Link header. The connector knows which one each endpoint uses and chooses the right strategy. It also knows that some endpoints lie about their total counts and adjusts.
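A per-endpoint strategy dispatch looks roughly like this. The parameter names (`offset`, `cursor`, `link_next`) are illustrative, not Procore's actual ones; `fetch_page` stands in for the HTTP call and returns `(records, meta)`.

```python
def paginate(fetch_page, strategy):
    """Yield every record from an endpoint using its paging strategy (sketch)."""
    if strategy == "offset":
        offset = 0
        while True:
            records, _ = fetch_page({"offset": offset})
            if not records:
                return
            yield from records
            # Advance by what we actually received, never by a reported
            # total -- some endpoints lie about their counts.
            offset += len(records)
    elif strategy == "cursor":
        cursor = None
        while True:
            records, meta = fetch_page({"cursor": cursor})
            yield from records
            cursor = meta.get("next_cursor")
            if cursor is None:
                return
    elif strategy == "link_header":
        url = "first-page"  # placeholder for the endpoint's base URL
        while url:
            records, meta = fetch_page({"url": url})
            yield from records
            url = meta.get("link_next")  # parsed from the Link header
    else:
        raise ValueError(f"unknown paging strategy: {strategy}")
```

Note the offset branch: advancing by the length of the returned page, rather than trusting a `total` field, is what makes the lying-about-counts endpoints harmless.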

Rate limits are tiered. Procore enforces per-app and per-company limits with separate quotas for read vs. write. Our connector tracks both, backs off when it sees a 429, and prioritizes writes over reads when they’re competing for budget.
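A minimal model of that budget tracking, assuming made-up quota numbers and a toy queue format; Procore's real limits and headers differ.

```python
class RateBudget:
    """Tiered rate-limit tracker: separate read/write quotas,
    backoff after a 429, writes prioritized over reads (sketch)."""

    def __init__(self, read_quota=100, write_quota=20):
        self.remaining = {"read": read_quota, "write": write_quota}
        self.backoff = 0.0  # seconds to wait before the next request

    def acquire(self, kind):
        """Spend one unit of the read or write quota if any remains."""
        if self.remaining[kind] <= 0:
            return False
        self.remaining[kind] -= 1
        return True

    def on_429(self, retry_after):
        # Honor the server's Retry-After, with our own doubling floor.
        self.backoff = max(retry_after, self.backoff * 2 or 1.0)

    def next_op(self, queue):
        """Pick the next operation: writes win when both compete for budget."""
        writes = [op for op in queue if op[0] == "write"]
        reads = [op for op in queue if op[0] == "read"]
        for op in writes + reads:
            if self.acquire(op[0]):
                return op
        return None  # out of budget; wait out self.backoff
```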

None of this is interesting. All of it has to work, every time, or your integration is unreliable. We wrote it once and you don’t have to.

The schema layer — what makes pre-built actually work

This is where most “pre-built” connectors fall apart. They expose Procore’s raw API objects (with their underscored field names, nested arrays, and inconsistent nullability) and call it done. We do the next layer of work: a stable, documented, opinionated schema that every customer maps to, instead of mapping to Procore’s API directly.

The reason matters. Procore’s API changes — fields get added, deprecated, and renamed every quarter. If your integration is mapped to the raw API, every quarter is a maintenance bill. If your integration is mapped to our schema, we absorb the change.

That schema work is genuinely opinionated. We had to make calls about how to handle Procore’s project lifecycle states, how to express committed-cost relationships in a normalized way, and what to do with custom fields. Those calls are visible — every customer sees the same schema, can review it, and can reason about what their data will look like.
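To make the normalization concrete, here is the shape of that layer as a sketch. Every field name on both sides is illustrative — neither Procore's real payload nor Aquifer's actual schema — but the moves are the real ones: coerce inconsistent nulls, fix units, flatten nested arrays onto stable keys.

```python
def normalize_change_order(raw):
    """Map a raw API object (underscored names, nested arrays,
    inconsistent nullability) onto a stable schema (sketch)."""
    return {
        "id": str(raw["id"]),
        "title": raw.get("title") or "",  # normalize null -> empty string
        # Money as integer cents: no float drift in financial fields.
        "amount_cents": round(float(raw.get("grand_total") or 0) * 100),
        "status": (raw.get("status") or "draft").lower(),
        # Flatten the nested line-item array onto stable keys.
        "line_items": [
            {
                "code": item.get("cost_code"),
                "amount_cents": round(float(item.get("amount") or 0) * 100),
            }
            for item in raw.get("line_items") or []
        ],
    }
```

Because every customer maps against the output shape rather than the input shape, a renamed field on the left side is a one-line change here instead of a change in every customer's integration.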

Conflict resolution and idempotency

The hardest engineering work isn’t moving the data; it’s deciding what happens when both sides change the same record at roughly the same time. Procore is a system that humans actively edit. Sage is a system that humans actively edit. When a PM and an accountant touch the same change order in a five-minute window, the integration has to know what to do.

Our default rule for committed costs is “Sage wins on financial fields, Procore wins on operational fields.” That’s a value judgment, but it’s the right one for construction — and it’s configurable per customer, per object, per field.
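That rule reduces to a small field-routing table. The sketch below is illustrative — the field names, the class labels, and the override format are all assumptions, not Aquifer's configuration language — but it shows the precedence: per-field overrides beat the class-level defaults.

```python
# Default routing: which system wins for each class of field.
DEFAULT_RULES = {
    "financial": "sage",
    "operational": "procore",
}


def resolve(procore_rec, sage_rec, field_classes, overrides=None):
    """Field-level conflict resolution (sketch).

    field_classes labels each field "financial" or "operational";
    overrides is the per-customer, per-field override map.
    """
    rules = {**DEFAULT_RULES, **(overrides or {})}
    merged = {}
    for field, cls in field_classes.items():
        # A field-specific override beats the class-level default.
        winner = rules.get(field, rules[cls])
        merged[field] = (sage_rec if winner == "sage" else procore_rec)[field]
    return merged
```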

Idempotency is the deeper requirement. Every sync is a re-run-able unit. If a sync fails halfway, the next one picks up cleanly without duplicates. That requires every write operation to be expressed as an upsert with a stable key, and every read to be staged before any write.
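The stage-then-upsert shape can be shown in miniature. Here a dict stands in for the destination table and `co_number` for the stable natural key — both assumptions for illustration; the property that matters is that re-running the whole sync converges instead of duplicating.

```python
def upsert(store, key, record):
    """Every write is an upsert on a stable key, never a blind insert."""
    store[key] = record


def run_sync(store, staged_records, key_fn):
    """Stage all reads first, then write.

    If a previous run died halfway, this run simply overwrites the rows
    it already wrote and continues -- no duplicates, no manual cleanup.
    """
    for rec in staged_records:
        upsert(store, key_fn(rec), rec)
    return store
```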

Observability and audit

Every record that moves through the connector is traceable. We log the source row, the destination row, the transformation applied, and the timestamp. If a project accountant asks “where did this number come from?”, the answer is one query away.
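The trace record and the "one query away" lookup might look like this in miniature. The entry shape and function names are invented for illustration; in production this would be a query against a log store, not a list comprehension.

```python
import datetime


def audit_entry(source_row, dest_row, transform_name):
    """One traceability record per moved row: source, destination,
    transformation applied, and timestamp (sketch)."""
    return {
        "source": source_row,
        "destination": dest_row,
        "transform": transform_name,
        "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }


def where_did_this_number_come_from(log, dest_id):
    """Trace a destination row back to its source and transformation."""
    return [e for e in log if e["destination"].get("id") == dest_id]
```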

Schema-drift detection runs continuously. If Procore adds a new field to the change-order endpoint, we know within hours and surface it as a recommendation, not as a broken sync.
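At its core, drift detection is a set difference between the fields we expect on an endpoint and the fields the API actually returned. A minimal sketch, with made-up field names:

```python
def detect_drift(known_fields, observed_record):
    """Diff expected fields against an observed API response (sketch).

    New or vanished fields become recommendations to review,
    not broken syncs.
    """
    observed = set(observed_record)
    return {
        "added": sorted(observed - set(known_fields)),
        "removed": sorted(set(known_fields) - observed),
    }
```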

The customer side — what’s left

After all of that, what’s left for the customer to define? Three things:

  1. The cost-code mapping. Their tree, our schema. Usually a half-day exercise.
  2. The conflict overrides. Which way each field flows when both sides write. Usually a one-page document.
  3. The sync cadence. Real-time, hourly, daily — driven by the workflow.
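Those three pieces fit in a configuration small enough to show whole. The keys and values below are an illustrative shape, not Aquifer's actual config format — but this is genuinely the scale of what the customer supplies.

```python
# Illustrative customer config: the three things left to define.
customer_config = {
    "cost_code_mapping": {          # their tree -> our schema
        "01-100": "sitework.general",
        "03-300": "concrete.cast_in_place",
    },
    "conflict_overrides": {         # which way each field flows on conflict
        "description": "procore",
        "unit_cost": "sage",
    },
    "sync_cadence": "hourly",       # "real-time" | "hourly" | "daily"
}
```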

That’s it. Everything else — auth, paging, rate limits, schema stability, conflict resolution, idempotency, observability — is in the box. We wrote it once. You don’t have to.

Want to see Aquifer in your stack?