Blog Shipping strategy

What to Centralize and How: A Field-Level Guide to Logistics Data for Shippers and 3PLs

Logistics data field mapping — shipment records, invoice lines, and rate data connected into one working dataset for freight cost analysis

What you'll get from this article

  • Know exactly which fields to collect for each dataset. Not just "shipment data" — the specific fields in the shipment record, the invoice line record, and the rate record that make freight cost questions answerable.
  • Understand how to link invoices to shipments in practice. The matching key problem is where most data projects stall. Here is a priority order for linking, and what to do when the obvious key is missing.
  • Clean up reference data before it breaks the analysis. The same carrier appearing under four different names across five source files kills every query. Here is what to normalize and how.
  • Build exception flags that surface real cost problems. Specific rules and thresholds — not "flag variances," but what condition, what threshold, and what action each flag should drive.
  • See what the finished master record looks like. A working field structure that connects the shipment, the invoice, and the rate agreement in one row.
  • Understand what 3PLs need to add. Additional fields for margin visibility and customer billing that shippers do not need but 3PLs cannot skip.

The decision to centralize logistics data is easy. The harder question is: exactly what fields go in each dataset, how do you connect them, and what do you build on top? This article answers those questions at the level of the work itself.

If you are looking for the case for why centralizing logistics data matters in the first place, start with Your Freight Invoices Are Too Expensive — But First, Centralize the Data. This article picks up where that one ends: you have decided to build the dataset. Now what, specifically, does that mean?

The shipment record: the backbone everything else connects to

Every other dataset links back to the shipment. If the shipment record is incomplete or inconsistently structured, every downstream analysis inherits the problem. Before collecting anything else, get the shipment record right.

The shipment record answers the most basic question in freight cost analysis: what actually moved? Without it, the invoice is a charge with no context. With it, every charge can be matched to a specific movement, a specific carrier, a specific customer, and a specific lane.

Field What it captures Why it matters
Shipment ID Your internal reference for this movement Primary key — every other record links to this
Order number The customer order that generated the shipment Connects freight cost back to the sale and customer
Carrier PRO / tracking number The carrier's own reference for the shipment Required to match against carrier invoices and tracking events
Bill of lading number The shipping document reference Fallback matching key when PRO is missing from invoice
Carrier name (normalized) The carrier used for this movement Must be consistent across all records — see reference data section
Service level code Standard ground, expedited, LTL, etc. Required to validate that the billed service matches what was tendered
Origin postal code / city Where the shipment was picked up Lane definition and zone calculation
Destination postal code / city Where the shipment was delivered Lane definition, zone calculation, and residential flag
Location type (destination) Commercial, residential, limited access Determines which accessorial surcharges are expected vs unexpected
Customer ID Which customer this shipment belongs to Required to analyze freight cost by customer
Tendered weight The weight recorded at the warehouse when the shipment was booked Baseline for spotting carrier reweigh adjustments on the invoice
Number of pieces / pallets Piece count or pallet count LTL pricing is often per-hundredweight; piece count affects minimum charges
Freight class (LTL) NMFC class used at booking Baseline for spotting carrier reclass adjustments; see LTL reweigh and reclass charges explained
Scheduled pickup date The date the carrier was expected to pick up Identifies late pickups that may trigger re-pickup or delay fees
Actual pickup date The date the carrier actually picked up Identifies missed pickup windows
Actual delivery date The date the shipment was delivered On-time delivery tracking; fuel surcharge effective date calculation
Quoted freight cost The rate that was shown at booking Foundation for quote-to-invoice variance tracking

A common gap: teams collect shipment header data (carrier, origin, destination, dates) but skip tendered weight, freight class, and quoted cost. Those three fields are required for any meaningful audit. Without tendered weight, you cannot detect a reweigh. Without freight class, you cannot detect a reclass. Without quoted cost, invoice variance is invisible.

The invoice record: line level is not optional

Invoice data collected as monthly totals — even by carrier — cannot support freight cost analysis. The cost story lives inside individual charge lines, not in the summary. A carrier invoice for $420 tells you nothing. A carrier invoice showing $280 base freight, $62 fuel surcharge, $48 liftgate delivery, and $30 appointment fee tells you four separate things, each with a different explanation and a different owner.

The invoice record requires one row per charge line, not one row per invoice. If your source data arrives as invoice-level totals, push back before accepting it. The line-level breakdown is not more data for its own sake — it is the only level at which accessorial patterns, rate discrepancies, and billing errors are visible.

Field What it captures Notes
Invoice number The carrier's invoice reference Required for duplicate detection — same invoice number from the same carrier on two records is a flag
Invoice date The date the carrier issued the invoice Used to match the correct fuel surcharge table and rate agreement effective period
Carrier PRO / tracking number The carrier's shipment reference on this invoice Primary link back to the shipment record
Charge type Base freight, fuel surcharge, liftgate, appointment, reweigh adjustment, residential, re-delivery, etc. Must be categorized consistently — not as "miscellaneous" or "other"
Charge amount The dollar amount for this specific charge line One amount per charge type per row
Billed weight The weight the carrier used to calculate the freight charge Compare against tendered weight in shipment record to detect reweigh
Billed freight class (LTL) The class the carrier used for rating Compare against tendered class in shipment record to detect reclass
Carrier name (normalized) Which carrier issued this invoice Must match the normalized carrier name in the shipment record exactly
Invoice currency CAD, USD, or other Required for accurate cost comparison across carriers on cross-border lanes
Credit or adjustment flag Whether this line is a charge or a credit Credits need to be matched to the original invoice they correct

One recurring problem: carrier invoices often arrive with charge types labeled inconsistently. One carrier calls it "fuel surcharge," another calls it "energy surcharge," a third buries it inside "accessorial fees." Before analysis is possible, these need to be mapped to a consistent internal charge type taxonomy. Building that mapping table once — carrier by carrier — saves hours of confusion every month going forward.

The rate record: what should have been charged

Without rate data, invoice data can only tell you what was charged. It cannot tell you whether what was charged was correct. Rate data is the baseline that turns every invoice line into a specific question: was this calculated correctly against the agreement that was in place when this shipment moved?

Rate agreements are typically stored as carrier PDFs or spreadsheet exports. The work of centralization is translating those documents into queryable records with effective dates so the right rate row applies to the right shipment.

Field What it captures Notes
Carrier name (normalized) Which carrier this rate applies to Must match exactly across all records
Service level Standard ground, expedited, LTL standard, etc. Rates vary by service — a single carrier may have five different rate tables
Origin zone or postal range Where the shipment originates Rate tables are often zone-based, not lane-specific
Destination zone or postal range Where the shipment is going Zone-to-zone matrix is the most common format
Weight break The weight range this rate row covers Rates typically step down per-unit as weight increases
Base rate The agreed linehaul rate for this zone/weight combination May be expressed as a flat amount or a rate per 100 lbs (cwt)
Discount percentage (if applicable) The negotiated discount applied to the carrier's published tariff Many contracts are structured as "X% off tariff" rather than absolute rates
Minimum charge The floor below which no shipment is billed Light shipments on short lanes frequently hit the minimum — this needs to be modeled, not ignored
Fuel surcharge table reference Which fuel index and rate table governs fuel for this agreement Fuel surcharge is recalculated weekly or monthly — the table and effective period must be captured
Accessorial rate by type The agreed rate for each accessorial charge Liftgate, appointment, residential, wait time, limited access — each needs its own agreed rate row
Effective date (start) The date this rate came into effect Critical — without this, the wrong rate row is applied to historical shipments
Effective date (end) The date this rate was superseded Open-ended for the current agreement; filled when a new agreement replaces it

The effective date pair is the most commonly skipped field and the one that causes the most analysis errors. A rate that changed in March should not be applied to January shipments. If the effective dates are not captured, every historical comparison is potentially wrong.

The linking problem: how to connect invoices to shipments

The most common place a data centralization effort stalls is the match between invoice and shipment. The two records come from different systems, use different reference numbers, and were created by different teams at different moments in time. Getting them to connect reliably is not automatic.

Use this priority order when matching.

Priority Matching key Reliability Common problems
1 Carrier PRO number — match between invoice PRO and shipment PRO High PRO not captured in TMS at booking; PRO assigned after pickup and not updated in source system
2 Bill of lading number — match between invoice BOL and shipment BOL Medium–high BOL number formats vary; some carriers use the customer's BOL, some assign their own
3 Order number — match between invoice reference field and order number in shipment record Medium Not all carriers include order number on invoice; format may differ (leading zeros, prefixes)
4 Composite key — carrier + destination + pickup date + weight range Low–medium Multiple shipments to same destination on same date create ambiguous matches
Unmatched Invoice with no shipment match after all keys exhausted Flag for manual review — may be duplicate invoice, ghost shipment, or data entry error

Every invoice that cannot be matched to a shipment after all four options are exhausted should be flagged before payment. A carrier invoice with no corresponding shipment in your system is either a billing error, a duplicate, or a shipment that was not captured in the source data. None of those should be paid without investigation.

The match rate — the percentage of invoices that link to a shipment record — is itself a useful metric. A match rate below 90 percent in the first month of data work is common. Below 75 percent means there is a source data problem worth investigating before building anything on top.

Reference data: the foundation that makes everything else work

Reference data is the set of standard values that allow records from different sources to be compared. When two records use different values for the same thing — "UPS" versus "United Parcel Service" versus "UPS Ground," or "residential" versus "resi" versus "home delivery" — every query that groups by carrier or location type produces wrong results.

Reference data cleanup is unglamorous work. It is also the work that determines whether every analysis that follows produces numbers you can trust.

Four reference categories need standardization before analysis is possible.

Carrier names. Pull every unique value in the carrier field across all source files. Map each variant to a single canonical name. Common problems: "FedEx Freight" and "FedEx LTL" and "Federal Express" appearing as three separate carriers; regional carriers appearing under both their own name and the name of a parent company they were acquired by. Build a mapping table — raw value on one side, canonical name on the other — and apply it every time data is loaded.

Service level codes. Different systems use different labels for the same service. Your TMS may call it "Ground Residential." The carrier's invoice may call it "Home Delivery." Your rate agreement may define it as "SVC-R." All three need to resolve to the same internal code. Without this, service-level analysis produces nonsense.

Location types. The residential vs commercial distinction drives a significant share of accessorial charges. The location type field in the shipment record needs to be a controlled value — not free text — and it needs to be accurate at booking, not corrected after the carrier charges the surcharge. Common values: Commercial Dock, Residential, Limited Access, Trade Show, Construction Site, Government Facility. If this field is empty or inconsistently populated, residential surcharge analysis is impossible.

Customer and location identifiers. The same delivery address may appear as "Costco Warehouse — Mississauga" in one system and "Costco — 2150 Argentia Road" in another. If location identifiers are not standardized, accessorial analysis by delivery location does not work. Build a location master — canonical name, address, postal code, location type — and use the location ID rather than free-text address fields as the analytical key.

Additional fields for 3PLs: the margin layer

A shipper needs to know what freight cost and whether it was correct. A third-party logistics provider needs to know all of that plus what was billed to the customer for the same movement — and whether carrier cost plus margin is running above or below what was charged.

Without the customer billing fields alongside the carrier cost fields, a 3PL can only see cost, not profitability. Accounts that look active can be running at a loss, and the problem is only visible after the accounting cycle closes.

Field What it captures Why it matters for 3PLs
Customer billing rate The rate the 3PL charged the customer for this movement Revenue side of the margin calculation
Customer billing amount (total) What the customer was actually invoiced for this shipment Including any accessorials passed through to the customer
Accessorial pass-through flag Whether each accessorial was billed to the customer or absorbed by the 3PL Identifies which charge types are being systematically absorbed and at what cost
Carrier cost (total for shipment) What the carrier charged the 3PL for this movement Cost side of the margin calculation
Gross margin per shipment Customer billing amount minus carrier cost Calculated field — the core profitability metric at the shipment level
Customer account ID Which 3PL customer this shipment belongs to Required to roll up margin by account and surface accounts running below target
Customer rate agreement effective date Which customer billing agreement was in effect at time of shipment Required to validate that the customer was billed under the correct rate agreement

The most common 3PL margin problem I see: a carrier implements a rate increase or adds a new accessorial, the 3PL's carrier cost goes up, but the customer billing rate is not updated because nobody connected the carrier invoice changes to the customer rate review process. Margin on those accounts compresses silently until someone runs the numbers. With the margin fields in the dataset, that compression is visible in real time rather than after the quarter closes.

Exception flags: what to build and what thresholds to use

Exception flags are where the connected dataset becomes action. A flag is a rule applied to the data that surfaces a record requiring review. The goal is not to flag everything — that creates noise. The goal is to flag the specific conditions that represent real cost problems, billing errors, or operational failures.

The following flags cover the majority of recoverable freight cost issues. Each one needs a defined condition, a threshold, and a clear next step.

Flag Condition Suggested threshold Next step
Duplicate invoice Same invoice number from the same carrier appears more than once Any duplicate — no tolerance Hold second instance; request carrier confirmation before payment
Unmatched invoice Invoice has no matching shipment record after all four matching keys are exhausted Any unmatched — no tolerance Hold for manual review; do not pay until shipment is confirmed
Rate discrepancy Billed base rate does not match the calculated rate from the agreed rate file for this carrier, zone, weight, and effective date Variance greater than $10 or 3% of base charge, whichever is larger Flag for dispute; document the correct rate and the actual billed rate
Carrier reweigh Billed weight on invoice is higher than tendered weight on shipment record Any positive difference; prioritize review where weight difference exceeds 10% or $25 Request carrier reweigh documentation; compare against warehouse scale record
Carrier reclass (LTL) Billed freight class on invoice does not match tendered freight class on shipment record Any mismatch Verify correct NMFC classification; dispute if tendered class was correct
Unexpected residential surcharge Residential delivery charge billed for a location flagged as commercial in the shipment record Any mismatch Confirm delivery address type; dispute if address is confirmed commercial
Quote-to-invoice variance Total invoiced amount is higher than quoted amount on shipment record Variance greater than $30 or 10% of quoted amount Identify which charge types created the variance; classify as expected, explainable, or disputed
Accessorial concentration The same accessorial charge type appears more than N times for the same delivery location in a rolling 90-day window Five or more occurrences of the same charge type at the same location Review whether the charge reflects a permanent location characteristic that should be flagged at booking
3PL margin below threshold Gross margin per shipment (customer billing minus carrier cost) falls below target Below agreed margin floor for that customer account or lane Review whether carrier cost increase was passed through to customer billing

A few practical notes on thresholds. Start conservative — flag fewer things with higher confidence, not everything with low confidence. A team that receives 300 flags per week and investigates none of them has the same outcome as a team with no flags. The threshold should match the investigation capacity available. Adjust over time as the team develops a sense of which flags are producing genuine recoveries and which are noise.

What the master record looks like

Once the shipment record, invoice record, and rate record are connected and the exception flags are applied, the result is a master record — one row per invoice charge line, linked to the shipment, with the rate baseline and exception status alongside it.

This is what the structure looks like in practice.

Category Key fields in this row
Shipment identity Shipment ID, order number, carrier PRO, BOL number
Carrier and service Carrier name (normalized), service level code
Lane Origin postal code, destination postal code, destination location type
Customer Customer ID, delivery location ID
Shipment facts Tendered weight, freight class, piece count, pickup date, delivery date, quoted amount
Invoice line Invoice number, invoice date, charge type (normalized), charge amount, billed weight, billed class
Rate baseline Expected rate for this charge type, variance amount, variance percentage
Exception flags Duplicate flag, unmatched flag, rate discrepancy flag, reweigh flag, reclass flag, residential mismatch flag, quote variance flag
3PL fields (if applicable) Customer billing amount, carrier cost, gross margin, accessorial pass-through flag

One row in this structure contains everything needed to answer whether a charge was correct, whether it was expected, and who should act on it. A query that filters for any row where a flag is active gives the exception queue. A query that groups by carrier and charge type gives the cost breakdown. A query that groups by customer and location gives the accessorial pattern by delivery point.

The four-week build: specific tasks and deliverables

The following plan assumes you are starting from scratch — carrier invoices in email or accounting software, shipment data in a TMS or spreadsheet, rate agreements in PDFs. Adjust based on what source data you already have in accessible form.

Week 1 — Source inventory and invoice extraction. List every carrier, every source system, and every data format. For each carrier, identify how invoices arrive: email attachment, portal download, EDI file, or accounting system export. Pull three months of invoices from each carrier. For each invoice, extract to a spreadsheet with one row per charge line. Build the charge type mapping table — what each carrier calls each charge type, mapped to your internal taxonomy. Deliverable: a flat file per carrier, three months of invoice lines, charge types normalized.

Week 2 — Shipment record extraction and invoice matching. Export shipment records for the same three-month window from your TMS, WMS, or order management system. Confirm which fields are populated and which are missing. Attempt the invoice-to-shipment match using the priority key order above. Calculate your match rate. For unmatched invoices, document the reason: missing PRO in TMS, no order reference on invoice, or no shipment record at all. Deliverable: matched invoice-shipment pairs, unmatched invoice list with reason codes, initial match rate metric.

Week 3 — Rate data and reference cleanup. Translate carrier rate agreements into a queryable rate table with effective dates. Build the carrier name mapping table and apply it to all three source files. Standardize service level codes and location type values. Link each matched shipment-invoice pair to the applicable rate row and calculate the expected charge. Flag rows where the billed amount does not match the expected amount above the threshold. Deliverable: rate baseline applied to all matched records, rate discrepancy flags active, reference data normalized.

Week 4 — Exception view and first review cycle. Apply all exception flags from the table above. Generate the exception queue — all flagged rows, sorted by variance amount. Run the first review cycle: for each flagged record, classify as confirmed error (dispute), operational explanation (document and accept), or requires more information. For the accessorial concentration flag, identify which delivery locations appear most frequently and whether the charge type reflects a permanent location characteristic that should be pre-flagged at booking. Deliverable: first live exception queue, first set of dispute candidates, first view of accessorial patterns by location.

What not to build before the data is clean

Two things are consistently attempted before the data is ready and produce results that waste more time than they save.

Dashboards on dirty data. A chart showing freight cost by carrier built on inconsistent carrier names will show the same carrier split across three bars with three different labels. The bar chart looks finished. The numbers are wrong. Clean the carrier name reference table first, then build the visualization.

AI or automation on unlinked records. Any tool that reads invoice data to detect anomalies, suggest disputes, or flag duplicates requires the invoice to be linked to the shipment. An AI that reads an invoice line in isolation cannot tell whether the liftgate charge was expected or unexpected — it does not know whether the delivery location was flagged as commercial. Connect the records first. The automation or analysis layer comes after.

For more on how logistics data reaches shippers in the first place — and why different carriers deliver it in different formats — see Why is logistics data so hard to get? EDI vs API explained.

The dataset is the foundation; the analysis comes after

The work described in this article is not glamorous. It is field mapping, reference table building, key matching, and exception rule writing. It does not produce a visible output in week one. It does not immediately answer the question of why freight cost went up last month.

But it is the work that makes every other effort possible. Without the specific fields in the shipment record, reweigh disputes are guesswork. Without the invoice at the line level, accessorial patterns are invisible. Without the rate record with effective dates, the audit compares charges to the wrong baseline. Without the reference data normalized, every query groups incorrectly. Without the exception flags, the review cycle is a manual search through thousands of rows.

Four weeks of careful data work produces a connected dataset that can be queried, audited, and acted on. It does not require a new system. It requires knowing what fields to collect, how to connect them, and what rules to apply on top.

That is what freight cost control actually looks like at the data level.

The questions are easy to ask. It is the data structure underneath that determines whether they can be answered.

If you are building this dataset and want a second opinion on structure, field choices, or exception logic, feel free to reach out at dzhang@stridedata.com.