
Challenge 4: Final Sprint

Recap:

In Challenge 3, you put the closed loop into practice (criteria, failing test, implement, passing test) and built features with confidence that your safety net would catch regressions. Your application is deployed, tested, and live. The pipeline works: Explore → Plan → Implement → Verify → Ship.

Then in Lesson 4, you confronted the remaining bottleneck: you. One story at a time. One conversation at a time. AI builds fast, but you're serial. You learned the delegation-ready test: can you spec it, is it bounded, is it independently buildable, would you know a good result? You practiced background execution, sending work to AI while you moved on to the next task. You planned your final sprint: user stories with acceptance criteria for every feature you want to ship, assessed for parallel safety.

You also saw the full system you've built: acceptance criteria define success (Lesson 1), decomposition keeps tasks bounded (Lesson 2), automated tests verify results (Lesson 3), and background execution lets you move faster (Lesson 4). That system is what lets you trust AI to work without watching every step.

The Challenge

This is the final sprint. Everything you've learned (delegation contracts, skills, TDD, parallel execution) comes together. Ship as much of your analytical backlog as you can by running multiple workstreams, batching similar work, and trusting your system to verify the results.

The Dark Vessel Risk Assessment Tool you demo at the end of this challenge is the culmination of all four challenges. Make it something worth showing.

Delegate with judgment. Trust your tests. Ship when they're green.

Remaining Data Sources

Challenge 4 opens the final two data files from your repository:

Piracy incidents (piracy-incidents.json): The National Geospatial-Intelligence Agency (NGA) maintains the Anti-Shipping Activity Messages (ASAM) archive, a public database of geocoded piracy reports, hostile actions, and suspicious activity at sea. Your repository includes incidents covering 2020 to 2024 across the same maritime regions as your AIS data. This data enables the "piracy plausibility test": when a vessel goes dark, check the piracy threat level for that region. The Gulf of Guinea and parts of Southeast Asia have active piracy threats, so vessels may legitimately disable AIS as a safety precaution (the IMO recognizes this). The Mediterranean has near-zero piracy, so a vessel going dark there has no legitimate safety reason. Dark fleet operators know this and exploit the piracy exception as cover.
Flag state performance list (flag-state-wgb-list.csv): The Paris MOU is an agreement among maritime authorities from 27 countries that coordinates port state control inspections. It rates every flag state by inspection results, categorizing each as White (low risk), Grey (medium risk), or Black (high risk). Vessels registered under black-list flags (Cameroon, Tanzania, Comoros, Togo) have weaker oversight and are disproportionately represented in the dark fleet. A vessel reflagging from a white-list to a black-list state is a sanctions evasion indicator. Your repository includes the current list so you can cross-reference it against vessel flags in your data.
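
The piracy plausibility test described above can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: the layout of piracy-incidents.json is not shown here, so the incident records (dicts with "lat" and "lon" keys), the 500 km search radius, and the count thresholds are all assumptions to revisit after you explore the real file.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def piracy_threat_level(gap_lat, gap_lon, incidents, radius_km=500.0):
    """Classify the piracy threat around an AIS gap location.

    `incidents` is assumed to be a list of dicts with "lat" and "lon" keys.
    The radius and the thresholds below are illustrative, not calibrated.
    """
    nearby = sum(
        1
        for inc in incidents
        if haversine_km(gap_lat, gap_lon, inc["lat"], inc["lon"]) <= radius_km
    )
    if nearby >= 10:
        return "high"
    if nearby >= 1:
        return "moderate"
    return "low"


def interpret_gap(threat_level):
    """Turn a threat level into the analyst-facing interpretation."""
    if threat_level == "low":
        return "suspicious: no piracy-related safety reason to go dark here"
    return "ambiguous: going dark may be a legitimate safety precaution"
```

A gap in a region with no nearby incidents comes back as "low" and is flagged as suspicious, while a gap surrounded by reported incidents is marked ambiguous rather than cleared.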

As with earlier challenges, take the Explore step before building: "What is the piracy plausibility test and how does piracy threat context change the interpretation of an AIS gap? How does flag state performance scoring work?"
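
As a starting point for the flag state cross-reference, here is a hedged Python sketch. The real column names in flag-state-wgb-list.csv are not shown on this page, so the "flag" and "list" headers are assumptions, and the vessel flag history is a hypothetical input your own data may or may not provide in this form.

```python
import csv
import io

# Higher number means weaker oversight under the White/Grey/Black rating.
RISK_ORDER = {"white": 0, "grey": 1, "black": 2}


def load_flag_lists(csv_text):
    """Map each flag state to its list colour.

    Assumes the CSV has "flag" and "list" columns (hypothetical names).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["flag"].strip().upper(): row["list"].strip().lower() for row in reader}


def reflag_downgrades(flag_history, flag_lists):
    """Return (old_flag, new_flag) pairs where a vessel moved to a worse list.

    `flag_history` is assumed to be a chronological list of flag state names.
    Flags missing from the list are skipped rather than guessed.
    """
    downgrades = []
    for old, new in zip(flag_history, flag_history[1:]):
        old_list = flag_lists.get(old.strip().upper())
        new_list = flag_lists.get(new.strip().upper())
        if old_list not in RISK_ORDER or new_list not in RISK_ORDER:
            continue
        if RISK_ORDER[new_list] > RISK_ORDER[old_list]:
            downgrades.append((old, new))
    return downgrades


# Made-up sample rows for illustration only; "Examplia" is not a real flag state.
SAMPLE_CSV = "flag,list\nExamplia,White\nCameroon,Black\n"
flag_lists = load_flag_lists(SAMPLE_CSV)
print(reflag_downgrades(["Examplia", "Cameroon"], flag_lists))
```

Skipping unknown flags keeps the check conservative: a vessel is only marked as reflagging downward when both flags actually appear in the list.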

What to Build

Baseline Capabilities

Items are listed in priority order. If time is tight, focus on the items near the top first.

Delegation-ready assessment applied: your team has evaluated the sprint backlog against the delegation-ready test, decided what to run in parallel vs. sequence, and is executing the plan you built in Lesson 4
Piracy plausibility context: when the analyst views a vessel's AIS gap history, they can see the piracy threat level for the region where each gap occurred. A vessel that went dark in the Mediterranean (near-zero piracy) has no legitimate safety reason. A vessel that went dark in the Gulf of Guinea (high piracy) is ambiguous. This context changes the analytical interpretation of the gap.
Destination analysis: each vessel's AIS static data includes a stated destination. The analyst can compare that against the vessel's actual position and heading to spot deceptive behavior. A vessel stating "FUJAIRAH" as its destination while heading in the opposite direction at 12 knots is worth investigating. The World Port Index (world-port-index.json) provides port locations for this comparison. Several sanctioned vessels in your data are heading away from their stated ports, and two are broadcasting different destinations under different names on the same transponder.
At least two features built in parallel: separate conversations, background execution, or batched similar work. Each feature has its own user story, acceptance criteria, and tests, independently built and independently verified.
Demo polish: landing page that explains what the tool does and who it's for, professional styling, clear navigation between traffic display, vessel investigation, and analytical findings. Make the live URL something you'd demo with confidence.
Final deployment: the live URL reflects the complete Dark Vessel Risk Assessment Tool, the product of four challenges of work, tested and shipped. All tests pass.
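
The destination analysis item above comes down to comparing a vessel's heading with the bearing to its stated port. The Python sketch below shows one way to do that. The record shapes (a vessel dict with "lat", "lon", "heading", and "sog" keys, and a port dict with "lat" and "lon") are assumptions, as are the 3-knot speed floor and the 90-degree deviation threshold; check the real AIS and world-port-index.json fields before relying on it.

```python
import math


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


def heading_deviation_deg(heading, bearing):
    """Smallest angle between two compass directions, in degrees [0, 180]."""
    diff = abs(heading - bearing) % 360.0
    return min(diff, 360.0 - diff)


def is_heading_away(vessel, port, min_speed_knots=3.0, max_deviation_deg=90.0):
    """Flag a vessel that is under way but steering away from its stated port.

    Slow or stationary vessels are ignored because their heading says little
    about where they intend to go. Both thresholds are illustrative.
    """
    if vessel["sog"] < min_speed_knots:
        return False
    bearing = initial_bearing_deg(vessel["lat"], vessel["lon"], port["lat"], port["lon"])
    return heading_deviation_deg(vessel["heading"], bearing) > max_deviation_deg


# Approximate, illustrative coordinates: a port near 25.1N 56.3E and a vessel
# east of it steaming further east at 12 knots.
port = {"lat": 25.12, "lon": 56.33}
vessel = {"lat": 25.12, "lon": 60.0, "heading": 90.0, "sog": 12.0}
print(is_heading_away(vessel, port))
```

Matching the free-text destination string (for example "FUJAIRAH") to a port record is a separate step that this sketch leaves out, since AIS destination fields are typed by hand and often abbreviated.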

Stretch Goals

These are options for teams that finish the baseline capabilities. Your team can also define your own stretch goals based on what interests you. Use the Explore step to brainstorm: ask your AI coding assistant about the analyst's workflow, research how intelligence products are structured, and think about what would make this tool genuinely useful on a watch floor. If you finished earlier challenges without completing all their stretch goals, consider going back to pick up features from those lists as well.

Composite risk scoring: combine all signals (sanctions match, gap history, vessel profile, satellite correlation, flag state risk, piracy context) into a single risk score. No single indicator is conclusive. But sanctions match plus repeated intentional AIS gaps plus no insurance plus black-list flag plus unmatched satellite detection in a low-piracy zone is a high-confidence finding. The analyst can triage from highest risk to lowest.
Flag state risk layer: cross-reference each vessel's flag against the Paris MOU White/Grey/Black list. A vessel flagged in a black-list state has weaker regulatory oversight and higher detention rates. A vessel that recently reflagged from a white-list to a black-list state is actively reducing its regulatory exposure.
Behavioral anomaly detection: surface vessels exhibiting suspicious movement patterns. Your data contains vessels that stop mid-voyage (possible ship-to-ship cargo transfer), vessels whose AIS goes silent for hours and then reappears (going dark), and vessels whose reported speed or position does not match their actual movement. Two sanctioned vessels in your data broadcast under two different names from locations thousands of miles apart on the same transponder. These patterns are discoverable if you look for them.
AI-synthesized intelligence summary: use your AI coding assistant to generate a plain-language summary of findings for the highest-risk vessels. Given a vessel's sanctions status, gap history, profile indicators, satellite correlation, and piracy context, produce a concise analytical paragraph that gives an incoming analyst or partner agency (OFAC, Fleet commanders) an immediate understanding of the risk picture. This is where AI becomes part of the product itself, not just the development tool.
Prioritized intelligence product: generate a report listing the highest-risk vessels with supporting evidence from every data source. Structure it for distribution: an incoming analyst reads it and immediately knows what needs attention, an OFAC analyst can use it to inform enforcement decisions, and a Fleet commander can use it to allocate monitoring resources.
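
For the composite risk scoring goal, one simple approach is an additive score with a list of reasons the analyst can inspect. The Python sketch below is only that: the VesselSignals fields and every weight are hypothetical choices made for illustration, not values from the course materials, and your team should tune them against vessels you already understand.

```python
from dataclasses import dataclass


@dataclass
class VesselSignals:
    """Per-vessel indicators gathered from earlier features (hypothetical shape)."""
    name: str
    sanctions_match: bool = False
    intentional_gap_count: int = 0
    has_insurance: bool = True
    flag_list: str = "white"  # "white", "grey", or "black"
    unmatched_satellite_detection: bool = False
    dark_in_low_piracy_zone: bool = False


# Illustrative weights only; the maximum possible total is 100.
FLAG_POINTS = {"white": 0, "grey": 5, "black": 15}


def risk_score(v):
    """Return (score, reasons) so every point on the score is explainable."""
    score, reasons = 0, []
    if v.sanctions_match:
        score += 30
        reasons.append("sanctions match")
    gap_points = min(v.intentional_gap_count, 4) * 5  # capped at 20 points
    if gap_points:
        score += gap_points
        reasons.append(f"{v.intentional_gap_count} intentional AIS gap(s)")
    if not v.has_insurance:
        score += 10
        reasons.append("no insurance")
    flag_points = FLAG_POINTS.get(v.flag_list.lower(), 0)
    if flag_points:
        score += flag_points
        reasons.append(f"{v.flag_list.lower()}-list flag")
    if v.unmatched_satellite_detection:
        score += 15
        reasons.append("unmatched satellite detection")
    if v.dark_in_low_piracy_zone:
        score += 10
        reasons.append("went dark in a low-piracy zone")
    return score, reasons


def triage(vessels):
    """Order vessels from highest to lowest risk score."""
    return sorted(vessels, key=lambda v: risk_score(v)[0], reverse=True)
```

Returning the reasons alongside the number keeps the score from becoming a black box: no single indicator is conclusive, so the analyst needs to see which combination produced a high rank.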

Final Growth Check-in

There is a final growth check-in on the next page. Make sure to navigate there and fill it out before the final reflection begins.

Tips

Start with the sprint plan you made in Lesson 4. Your team already identified which features pass the delegation-ready test and assessed parallel safety. Execute the plan. Don't re-plan.
Your tests are what make background execution safe. You don't need to watch AI build. Send a feature to the background, move on to your next story, and when the background task finishes, run the tests. Green means it worked. Red tells you exactly what to fix. The acceptance criteria you wrote define done; the automated tests you built in Challenge 3 verify it. That's the system working for you.
Check the foundation before you build on it. Three challenges of features may have left your codebase with inconsistencies or data-handling patterns that vary across files. Before the sprint, ask AI: "Look at our codebase. Are there any files that are too large, duplicated patterns, or things you'd reorganize before we add more features?" Cleaning up now is faster than untangling it later.
Pipeline as a team, not four builders at once. Your team has four workspaces, but that doesn't mean four people implementing simultaneously. While one person builds and syncs a feature, teammates can explore the piracy data and flag state list, write user stories and acceptance criteria for the next batch, or review what was just deployed. Keep the pipeline full: Explore, Plan, Implement, Verify happening in parallel across the team, not four implementations racing to sync.
Let the tests do the integration check. After syncing a completed feature, run the full test suite. If everything passes, the feature integrates cleanly. If something fails, you know exactly what broke, and you fix that, not everything.
If syncing fails with a conflict, your AI assistant will usually try to resolve it automatically. If it doesn't, tell it: "I have a merge conflict. Help me resolve it." Conflicts happen when two people changed the same file. They're a normal part of parallel work, not an error on your part. After resolving, run the full test suite to make sure everything still works.
Redeploy often. Every time a batch of tests goes green, ship it. The live URL should reflect your latest verified work throughout the sprint, not just at the end.
Take breaks. AI does the building, but you're doing the thinking: evaluating output, making decisions, coordinating with teammates. That's genuinely tiring. If your judgment starts slipping, step away for five minutes. A rested delegator makes better calls than a rushed one.