Analytiq Hub

Navigating Compliance in Startups: A Guide for Regulated Industries

2025-09-10T00:00:00+00:00

In the dynamic world of startups, especially those operating in heavily regulated sectors like healthcare, fintech, AI, and consumer hardware, compliance isn’t just a checkbox—it’s a foundational element that safeguards innovation, protects users, and unlocks growth opportunities.

As regulations evolve rapidly — with frameworks like HIPAA, SOC2, FDA approvals, and GDPR becoming more stringent — startups must integrate compliance from the outset to avoid costly pitfalls.

In this blog post, we draw from insights shared at Startup Boston Week 2025’s Engineering with Guardrails panel, featuring Ilsa Webeck (Simbex), Marjan Monfared (Leaf Guardian), and myself (DocRouter.AI), moderated by Rabeeh Majidi (OrthoKinetic Track) - incorporating panel wisdom alongside other general best practices.

Whether you’re a technical founder or CTO, understanding these principles can help you build resilient products that scale without regulatory roadblocks.

How Do Regulatory Frameworks Differ From Each Other?

While all regulatory frameworks aim to promote safety, security, and accountability, it’s vital to understand their key differences, as advice applicable to one may not translate to others.

HIPAA is a U.S. federal law mandating the protection of protected health information (PHI) through privacy and security rules, including breach notifications, for covered entities like healthcare providers and their business associates.
FDA regulations for pharmaceuticals focus on ensuring drug safety and efficacy via rigorous processes like New Drug Applications (NDAs), which often involve extensive clinical trials and can take an average of 12 years.
In contrast, FDA oversight for medical devices is risk-based, with devices classified into Class I (low risk, general controls), Class II (moderate risk, often requiring 510(k) clearance for substantial equivalence to existing devices), and Class III (high risk, needing Premarket Approval with clinical data), typically spanning 3-7 years.
SOC2, developed by the AICPA, is an attestation framework for service organizations, evaluating controls across trust services criteria (security, availability, etc.) via Type 1 (design) or Type 2 (operational effectiveness) reports, and it’s not industry-specific or legally mandated like HIPAA.
HITRUST is a voluntary, certifiable framework tailored to healthcare, incorporating HIPAA along with standards like NIST and ISO for comprehensive risk management, going beyond HIPAA’s requirements with prescriptive controls and global applicability.

For example, while FDA emphasizes product testing and recalls, HIPAA prioritizes data handling, and SOC2 focuses on organizational controls—highlighting why strategies must be customized to each framework’s unique scope and enforcement.

Common Compliance Mistakes - And How To Sidestep Them

One of the most frequent errors startups commit is treating compliance as an afterthought, leading to retrofits that drain resources and expose vulnerabilities.

As Ilsa Webeck noted, the biggest pitfall she sees is really around understanding what the compliance landscape is going to look like for your product. Startups often assume their product’s intended use and launch plan align with FDA expectations without researching requirements or establishing a quality management system (QMS). This lack of preparation can lead to costly pivots, as some clients she worked with had to change their product’s market approach after regulatory reviews revealed misaligned pathways.

In healthcare, overlooking patient privacy under HIPAA can result in breaches from unsecured devices or inadequate employee training—issues like staff snooping on patient records are top violations that have fined organizations millions.
Fintech startups often fail to address third-party risks in AI integrations, such as unvetted vendors handling sensitive data.
A critical oversight is disregarding data quality. In multi-stage data processing, bad inputs propagate through every later stage (“Garbage In, Garbage Out”), which can result in flawed fraud detection or biased models that regulators scrutinize.
A typical mistake, for SOC2, is treating it as a one-time checkbox rather than an ongoing process, often skipping a readiness assessment or gap analysis.
Broader pitfalls include underestimating documentation for SOC2 audits or assuming global regs like GDPR don’t apply early on.
Other top reasons for failed audits include poor vulnerability management and inconsistent employee onboarding/offboarding.
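
To make the data-quality point concrete, here is a minimal sketch of a validation gate placed between two pipeline stages. The `Transaction` record, its fields, and the accepted currency list are hypothetical and chosen only for illustration; a real pipeline would encode its own rules.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # Hypothetical record shape, for illustration only.
    account_id: str
    amount: float
    currency: str


def validate(tx: Transaction) -> list[str]:
    """Return a list of data-quality problems; an empty list means the record passes."""
    problems = []
    if not tx.account_id.strip():
        problems.append("missing account_id")
    if tx.amount <= 0:
        problems.append("non-positive amount")
    if tx.currency not in {"USD", "EUR", "GBP"}:
        problems.append(f"unexpected currency: {tx.currency}")
    return problems


def quality_gate(batch: list[Transaction]):
    """Split a batch into clean records and rejected records with their reasons."""
    clean, rejected = [], []
    for tx in batch:
        problems = validate(tx)
        if problems:
            rejected.append((tx, problems))
        else:
            clean.append(tx)
    return clean, rejected
```

Only the clean records move on to the next stage, while the rejected ones, with their reasons, can be logged for review instead of silently feeding a downstream model.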

Non-Compliance Costs Can Be Staggering

HIPAA violations can reach $50,000 per incident, with cumulative fines up to $1.5 million annually; SOC2 lapses contribute to data breaches averaging $4.45 million; and GDPR penalties can reach 4% of global annual revenue, starting at $20,500 for small startups but scaling to millions. Overall, non-compliance is 2.71 times more expensive than proactive programs. To avoid this, conduct an early gap analysis and prioritize compliance by design.

How Do You Balance Moving Fast with Staying Compliant?

Startups thrive on velocity, but rushing without guardrails invites failure. General best practices:

Adopt a phased compliance approach, like starting with SOC2 Type 1 for quick foundational controls before Type 2.
Aim for HIPAA-ready cloud services (e.g., AWS with Business Associate Agreements) to prototype compliantly.

“For the early traction, you need to work with the hospitals”, Marjan says. “So how we did that is we make sure all the first prototype equipment that we are providing is off-the-shelf but is HIPAA compliant itself. For example, for data collection, we used a device that was already on the list of HIPAA-compliant units.”

Use automation tools like Vanta or Drata for cloud compliance monitoring (10-15 similar vendors available). These tools help avoid duplicate effort between HIPAA, SOC2, HITRUST and GDPR, and have relationships with familiar auditors.
Integrate “shift-left security” into CI/CD pipelines for real-time compliance scans, ensuring innovations roll out without rework.
Designate a compliance point person to handle audits without halting engineering.

And bring external expertise early on:

“I talk a lot to clinicians and patients about new technology just when they’re even kind of napkin sketches or little graphic representations,” says Ilsa Webeck.

“That’s where you get feedback from the potential users about how this is actually going to work in the real world. You can adjust again to understand what could be commercially viable and then reflect back on the regulatory pathway that you need to be successful in moving it down the line. Get many advisors in, and feedback in from all of your main stakeholders.”

Innovating Within Strict Rules: Turning Constraints into Opportunities

While AI offers significant innovation potential in the highly regulated healthcare industry, startups should avoid superficially integrating AI just because it’s trendy.

Understand the regulatory frameworks first.
Also, study existing FDA-approved AI applications.
AI solutions must address genuine unmet needs and be tailored to specific problems, rather than being forced into products without clear purpose.

The AI revolution has dramatically accelerated development. With AI editors like Claude Code and Cursor, projects that used to take three months now take a week or half a week. Code review with AI and enhanced development workflows make innovation much faster for engineers.

“There’s a huge wave of innovation coming and I don’t know how it’s going to impact regulations as a framework, but for sure it’s going to impact it”, says Andrei. “The companies that build software stacks to help with regulatory evaluation already embed AI in the document processing. It’s now simpler to see whether you’re on track with your compliance.”

“For the DocRouter.AI”, says Andrei, “being a horizontal platform means that you can have customers with different security requirements. Supply Chain requirements are going to be different than for Insurance or Medical or Legal. You can make a lot of progress with customers with lesser requirements. Then, the feature you develop is portable to other domains.”

“It’s a game of figuring out how to work with customers sharing their data, and how to find the right product pilots, so that when you go to a strictly regulated environment with HIPAA data, for example, then you can make quick progress.”

Tips for enabling innovation:

For HIPAA, use modular architectures where sensitive PHI is isolated in compliant modules, allowing rapid iteration on non-sensitive features.
Adopt privacy-by-design principles, like anonymizing data for external AI testing, and leverage compliant cloud services (e.g., AWS with BAAs) to experiment securely.
For SOC2, focus on relevant Trust Services Criteria (e.g., Security and Privacy for data-heavy innovations) to avoid over-scoping.
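
As a rough illustration of isolating PHI before data leaves a compliant module, the sketch below drops direct identifiers and replaces the patient ID with a keyed hash. The field names, the `patient_ref` key, and the helper functions are assumptions made for this example. It is not a complete HIPAA de-identification procedure, and pseudonymized data may still count as PHI, so treat it as a starting point only.

```python
import hashlib
import hmac

# Hypothetical field names; a real system would derive this list from its own data model.
PHI_FIELDS = {"name", "date_of_birth", "address", "phone", "email"}


def pseudonymize_id(patient_id: str, secret_key: bytes) -> str:
    """Replace a patient identifier with a keyed hash so records can still be joined."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def strip_phi(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record with direct identifiers removed and the ID pseudonymized."""
    safe = {k: v for k, v in record.items() if k not in PHI_FIELDS and k != "patient_id"}
    safe["patient_ref"] = pseudonymize_id(record["patient_id"], secret_key)
    return safe
```

Keeping the secret key inside the compliant module means the non-sensitive side can iterate on features using `patient_ref` without ever seeing the original identifier.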

Recommended Tools:

Snyk or Dependabot: Scan for vulnerabilities in code dependencies.
Trivy or Aqua Security: Check container images for misconfigurations.
AWS Config or Azure Policy: Enforce compliant cloud configurations (e.g., HIPAA-eligible services). Implement infrastructure scanning.
Write unit tests to confirm API endpoints enforce role-based access control (RBAC).
Ensure all automated checks generate logs for audit trails (critical for HIPAA and SOC2).
Store results in a centralized compliance dashboard (e.g., Vanta) for easy review during audits.
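
The RBAC and audit-log items above could look something like the following sketch, which uses only the Python standard library. The role map, the permission names, and the `is_allowed` helper are hypothetical; in a real service the same assertions would target the actual API endpoints.

```python
import logging
import unittest

# Hypothetical role-to-permission map; a real service would load this from its own policy store.
ROLE_PERMISSIONS = {
    "admin": {"read_record", "write_record"},
    "clinician": {"read_record"},
    "guest": set(),
}

audit_log = logging.getLogger("audit")


def is_allowed(role: str, permission: str) -> bool:
    """Check a permission and write the decision to the audit logger."""
    allowed = permission in ROLE_PERMISSIONS.get(role, set())
    audit_log.info("role=%s permission=%s allowed=%s", role, permission, allowed)
    return allowed


class RbacTest(unittest.TestCase):
    def test_clinician_cannot_write(self):
        self.assertFalse(is_allowed("clinician", "write_record"))

    def test_unknown_role_is_denied(self):
        self.assertFalse(is_allowed("intern", "read_record"))

    def test_decision_is_logged(self):
        # Every access decision should leave a line in the audit log.
        with self.assertLogs("audit", level="INFO") as captured:
            is_allowed("admin", "write_record")
        self.assertIn("allowed=True", captured.output[0])
```

Saved in a test module, this runs with `python -m unittest`, so it can sit in the same CI pipeline as the scanners listed above.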

Best Practices for Healthcare Documentation

Establish Robust Communication with Healthcare Clients
- Engage Early with Hospitals: Initiate and maintain open, consistent communication with hospital clients to understand their unique, often non-public internal regulations. This is critical during both preclinical and clinical phases to align your product or service with their specific requirements.
- Engage Hospital Ethical Committees: Prior to formal collaboration, ensure discussions with the hospital’s ethical committee are part of the process. Submitting proposals to these committees is a mandatory step post-contract, as their approval is essential to proceed. Be prepared for potential rejections and plan accordingly.
Tailor Documentation to Hospital Needs
- Customize Documentation: Recognize that documentation requirements vary across hospitals. Avoid assumptions about universally accepted formats (e.g., PDFs or specific applications). Actively consult with each hospital to determine their preferred reporting formats and classifications to ensure compliance and usability.
- Leverage Technology for Compliance: For products like those using thermal temperature mapping to detect early signs of pressure injuries (e.g., ischemia or blood flow disruption), integrate AI-driven, physics-informed solutions to enhance accuracy beyond traditional methods. Ensure all patient data logging includes visual components to meet hospital documentation standards.
Resource Management for Startups
- Optimize Limited Resources: As a startup with constrained resources, prioritize partnerships with hospitals where your product or service can be delivered effectively within your capabilities. This strategic focus mitigates risks associated with overextension.
- Proactive Inquiry: Regularly ask hospitals for feedback on documentation and reporting needs to avoid costly missteps. Tailoring solutions to specific hospital requirements enhances compliance and strengthens client relationships.

Best Practices for Engineering Documentation

Leverage modern AI tools: Claude Code and Cursor have significantly improved. Use them to streamline the engineering documentation process.
Prioritize Design Phase: Use these tools to first create BRDs and PRDs, then Architecture and System Docs. Creating unit tests ahead of code implementation allows AI agents to iterate efficiently.

When to Bring in Outside Help (vs. In-House Management)

Bootstrap until you can’t — then scale expertise. Bring in external help for HIPAA when your startup lacks internal expertise. Consultants are ideal for mapping controls. Outsource when scaling rapidly or facing overlapping frameworks (e.g., HIPAA + SOC2).

Keep it in-house for maintenance once policies are set, using a dedicated officer. Engage outsiders early for readiness assessments ($10,000–$15,000 for HIPAA mocks) if pre-seed, but only for audits (mandatory external for SOC2 Type 2). Costs vary: SOC2 full compliance runs $15,000–$50,000 in 2025, HIPAA certification $40,000+ for complex setups. For global operations, hire fractional experts versed in GDPR overlaps to align frameworks efficiently, preventing siloed efforts.

Designing Tech for Evolving Rules

Regulations shift — your stack must adapt. Design tech for evolving regulations by adopting modular, scalable architectures. Build with compliant platforms (e.g., HIPAA-eligible AWS services). For SOC2, focus on flexible controls and ‘shift-left’ integration: embed compliance checks in DevOps pipelines.

Use AWS Bedrock for compliant LLMs and conduct quarterly control mappings. In 2025, anticipate AI ethics updates—implement audit trails for traceability. Multi-framework tools like Drata unify HIPAA, SOC2, and GDPR monitoring, reducing adaptation costs. Regular third-party pentests keep you ahead of cyber evolutions.
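
One possible shape for an audit trail that supports traceability is a hash-chained log, where each entry commits to the one before it. The sketch below is an illustration under that assumption, not a description of any particular product; the `actor` and `action` fields are hypothetical, and a production trail would also record timestamps and live in append-only storage.

```python
import hashlib
import json


def _digest(body: dict) -> str:
    """Hash the canonical JSON form of an entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(trail: list, actor: str, action: str) -> dict:
    """Append an entry whose hash covers its content and the previous entry's hash."""
    prev_hash = trail[-1]["hash"] if trail else "0" * 64
    body = {"actor": actor, "action": action, "prev_hash": prev_hash}
    entry = {**body, "hash": _digest(body)}
    trail.append(entry)
    return entry


def verify(trail: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = "0" * 64
    for entry in trail:
        body = {"actor": entry["actor"], "action": entry["action"], "prev_hash": entry["prev_hash"]}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash depends on the previous one, changing an old entry invalidates everything after it, which makes after-the-fact edits easy to detect during a review.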

Talking Risk and Compliance with Investors: Building Credibility

Investors probe compliance to gauge risks, so be ready. Focus on understanding the highest-risk areas and putting the best plan in place. Show progress as you fill in knowledge gaps. Always include a compliance slide that explains clearly, with reference to the applicable regulations, why your product falls into its regulatory category and what the ROI is.

Quantify costs (e.g., validation budgets) and mitigations. Pitch to sector-savvy VCs to skip basics. Highlight how compliance enables expansion—e.g., SOC2 certification boosts trust, closing deals 2x faster.

One Tip for Your First Regulatory Submission

Facing FDA or HIPAA? Start informed. For complicated problems, look around and see how others do it. Research vendor SOC2 selection guidance available through industry resources. Understand your pathway and have a clear description of your intended use.

Know your FDA class early via 513(g) requests; document thoroughly (treat it as strategic communication); collaborate via Pre-Sub meetings. For HIPAA, assess applicability first—many apps aren’t covered. Aim narrow for initial submissions to iterate later, avoiding common failures like incomplete indications.

Why Compliance Pays Off: The Bigger Picture

Non-compliance isn’t just risky; it’s ruinous, with breaches costing unprepared firms tens or hundreds of thousands of dollars. Yet proactive startups see ROI: faster market entry, investor confidence, and scalable innovation. As regs like AI guidelines tighten, embed compliance now. Resources like FDA’s “breakthrough” designated programs lower barriers. For more, revisit the panel recording or consult experts: your guardrails today fuel tomorrow’s breakthroughs.

Insights blend panel discussions from September 9, 2025, with industry best practices. Always seek tailored legal advice.

Build Your Company Website with Analytiq Pages

2025-08-30T00:00:00+00:00

Setting up a professional company website shouldn’t require a team of developers or expensive hosting solutions. Analytiq Pages is our streamlined approach to building beautiful, fast company websites using GitHub Pages, Jekyll, and Tailwind CSS - completely free and with enterprise-grade reliability.

📢 Join the Analytiq Pages discussion on LinkedIn

Why Analytiq Pages?

After building numerous company websites and helping startups establish their web presence, we’ve distilled the best practices into a reproducible system that delivers:

Zero hosting costs with GitHub Pages
Professional design with Tailwind CSS
Easy content creation using Markdown
Git sandbox editable with Claude Code or Cursor to create content and visualizations
Fast deployment with git-based workflows
Enterprise reliability backed by GitHub’s infrastructure

The Analytiq Pages Stack

GitHub Pages

Free, reliable hosting with enterprise-grade infrastructure

Jekyll

Write content in simple Markdown

Tailwind CSS

Embed HTML+Tailwind directly in Markdown for sophisticated layouts

Setting Up Your Company Website

The fastest way to get started is by using our reference implementation at analytiqhub.com as your starting point. This site demonstrates all the Analytiq Pages features in action and serves as a complete template for your company website. You’ll clone this proven foundation and customize it with your branding, content, and specific requirements.

Let’s walk through creating a professional company website using the Analytiq Pages approach.

Step 1: Repository Setup

Start by forking the Analytiq Pages reference implementation:

Fork the repository: Go to https://github.com/analytiq-hub/analytiq-hub.github.io and click “Fork”
Rename your fork: In your forked repository settings, rename it to your-company.github.io
Clone for local development:

# Clone your forked repository
git clone https://github.com/your-company/your-company.github.io.git
cd your-company.github.io

Step 2: Customize for Your Company

Replace all Analytiq Hub references with your company information:

Key files to update:

_config.yml - Site configuration:

title: Your Company Name
email: contact@yourcompany.com
description: Your company description
baseurl: ""
url: "https://your-company.github.io"
github_username: yourcompany

index.md - Homepage content
about.md - Company information
_posts/ - Replace sample blog posts with your content
assets/images/ - Replace logos and images
CNAME - Update with your custom domain (if using one)
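
A quick way to check that no template references were missed is to scan the fork for the old name. The following is a minimal sketch; the `find_leftovers` helper, the marker string, and the list of file types are assumptions made for this example rather than part of the template.

```python
from pathlib import Path

# Hypothetical defaults: adjust the marker and file types to match your fork.
MARKER = "analytiq"
TEXT_SUFFIXES = {".md", ".html", ".yml", ".yaml"}


def find_leftovers(root: str, marker: str = MARKER) -> list:
    """Return (relative path, line number, line) for each line that still mentions the marker."""
    hits = []
    root_path = Path(root)
    for path in sorted(root_path.rglob("*")):
        rel = path.relative_to(root_path)
        # Skip Git internals and Jekyll's build output.
        if ".git" in rel.parts or "_site" in rel.parts:
            continue
        if not path.is_file():
            continue
        if path.suffix not in TEXT_SUFFIXES and path.name != "CNAME":
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if marker.lower() in line.lower():
                hits.append((str(rel), number, line.strip()))
    return hits
```

Calling `find_leftovers(".")` from the repository root lists every remaining mention, which is handy to run once more just before the first deploy.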

GitHub Pages setup:

Go to your repository Settings → Pages
Set source to “Deploy from a branch” → main
Add custom domain if you have one

The forked template already includes Tailwind CSS configuration, so your site is ready to run locally with make dev or deploy immediately to GitHub Pages.

Step 3: Local Development Setup

Install local development prerequisites and start developing locally:

# Install dependencies
make install

# Start development server
make dev

For complete setup instructions including Ruby, Bundler, and troubleshooting, see the full Local Development Setup guide.

Point your web browser to the local development server at http://localhost:4000.

Files edited in the sandbox, whether manually or with Claude Code, Cursor, or your preferred AI editor, are refreshed automatically on the development server.

Changes to the menu or the footer require a restart of the local development server.
Once the GitHub Pages pipeline is set up, pushing local changes triggers the website update at https://your-company.github.io

Step 4: Essential Company Pages

Update the core pages with your company information:

index.md - Homepage with compelling hero section and company value proposition
about.md - Company story, team information, and mission
contact.md - Contact information and inquiry forms

Step 5: Header and Footer Customization

Start by configuring your site navigation in _config.yml:

# Header navigation menu
header_pages:
  - title: "Services"
    url: "#"
    children:
      - title: "Consulting"
        url: "/consulting"
      - title: "Development"
        url: "/development"
  - title: "Case Studies"
    url: "/case-studies"
  - title: "Blog"
    url: "/blog"
  - title: "About"
    url: "/about"
  - title: "Contact"
    url: "/contact"
    button_style: "solid"

# Footer sitemap
site_map:
  - title: "Services"
    links:
      - title: "Consulting"
        url: "/consulting"
      - title: "Case Studies"
        url: "/case-studies"
  - title: "Company"
    links:
      - title: "About"
        url: "/about"
      - title: "Contact"
        url: "/contact"

Then customize the visual elements:

_includes/custom-header.html - Company logo, announcements, search
_includes/custom-footer.html - Contact info, social links, legal pages

Step 6: Blog Setup

Jekyll’s blog functionality is ready out of the box:

Create posts in _posts/ using the format: YYYY-MM-DD-post-title.md
Posts automatically appear on your homepage and /blog page
Use front matter to set title, author, categories, and featured images
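
If you create posts often, a small helper can produce the filename and front matter in the expected shape. This is only a sketch: `post_filename` and `front_matter` are hypothetical helpers, the slug rule is one reasonable choice among several, and titles containing double quotes would need extra escaping.

```python
import re
from datetime import date


def post_filename(title: str, published: date) -> str:
    """Build a Jekyll post filename in the YYYY-MM-DD-post-title.md format."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{published.isoformat()}-{slug}.md"


def front_matter(title: str, published: date, categories: list) -> str:
    """Render a minimal front matter block for a post."""
    cats = ", ".join(categories)
    return (
        "---\n"
        "layout: post\n"
        f'title: "{title}"\n'
        f"date: {published.isoformat()}\n"
        f"categories: [{cats}]\n"
        "---\n"
    )
```

Writing the front matter followed by the Markdown body to `_posts/` under the generated filename is enough for Jekyll to pick up the post.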

Step 7: Case Studies

Showcase your work with detailed case studies:

Add case studies to the _case_studies/ collection
Use the case study template for consistent formatting
Include client results, project images, and key outcomes

Step 8: (Advanced) Custom Components

The template includes Tailwind-powered components for:

Call-to-action sections
Team member profiles
Service feature cards
Client testimonials

Your AI editor can help create custom components by combining Jekyll’s liquid templating with Tailwind’s utility classes.

Step 9: (Advanced) Custom Layouts

Create specialized page layouts by extending the base templates:

_layouts/landing.html - For marketing campaigns and product launches
_layouts/portfolio.html - Showcase projects with image galleries
_layouts/team.html - Team member profiles with bios and photos

Copy existing layouts as starting points and modify with your specific content structure and Tailwind styling.

Deployment and Domain Setup

GitHub Pages Configuration

Enable GitHub Pages in repository settings
Set source to “Deploy from a branch” → main
Custom domain: Add your company domain in settings

Domain Configuration

DNS Setup:

# For apex domain (company.com)
A record: 185.199.108.153
A record: 185.199.109.153
A record: 185.199.110.153
A record: 185.199.111.153

# For www subdomain
CNAME record: your-company.github.io

CNAME File:

echo "your-company.com" > CNAME
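
After the DNS change propagates, it can help to confirm that the apex domain resolves to the four addresses listed above. The sketch below uses the system resolver through Python's standard `socket` module; `resolve_ipv4` and `check_apex` are hypothetical helper names, and results depend on your local DNS cache.

```python
import socket

# The four GitHub Pages apex addresses listed in the DNS setup above.
GITHUB_PAGES_IPS = {
    "185.199.108.153",
    "185.199.109.153",
    "185.199.110.153",
    "185.199.111.153",
}


def resolve_ipv4(domain: str) -> set:
    """Resolve a domain to its IPv4 addresses using the system resolver."""
    infos = socket.getaddrinfo(domain, 443, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return {info[4][0] for info in infos}


def check_apex(resolved: set) -> tuple:
    """Return (missing, unexpected) addresses relative to the GitHub Pages set."""
    return GITHUB_PAGES_IPS - resolved, resolved - GITHUB_PAGES_IPS
```

For example, `check_apex(resolve_ipv4("your-company.com"))` returns two empty sets once all four A records are in place and nothing else is configured.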

Analytiq Pages Best Practices

Content Strategy

Homepage: Clear value proposition and call-to-action
About: Company story and team credibility
Services/Products: Detailed offering descriptions
Blog: Regular insights to build authority
Contact: Multiple ways to reach you

SEO Foundation

Meta descriptions: Add to each page’s front matter
Google Analytics: Add tracking code to _includes/custom-head.html

Maintenance and Updates

The beauty of Analytiq Pages is its simplicity:

# Update content
git add .
git commit -m "Update company blog post"
git push origin main
# Site updates automatically within minutes

Why Companies Choose Analytiq Pages

Startups: Get online fast without burning budget on hosting or developers
Agencies: Deliver professional sites quickly for clients
Enterprise: Maintain security and compliance with git-based workflows
Content Teams: Edit in Markdown without technical dependencies

Conclusion

Analytiq Pages combines the best of modern web development - GitHub’s reliability, Markdown simplicity, and Tailwind’s design power - into a streamlined system perfect for company websites. Whether you’re launching a startup or refreshing an enterprise web presence, this approach delivers professional results without the complexity.

Ready to build your company website? Check out our Analytiq Pages starter template or contact us for custom implementation support.

Want to see Analytiq Pages in action? This very website was built using these exact techniques. View the source code to see how we implement our own recommendations.

From Jekyll Minima to Tailwind: A Seamless Migration Story

2025-08-20T00:00:00+00:00

I can’t believe it. Claude Code was able to update my Jekyll-based site bitdribble.github.io to use Tailwind pretty much with no intervention. The transformation from the old, less flexible Minima theme to a modern Tailwind-powered setup was vibe coded with a few light touches.

The Challenge with Minima

For years, I’ve been running my personal knowledge repository on Jekyll with the default Minima theme. While Minima served its purpose, it had several limitations. The look and feel was outdated, and layouts were especially rigid.

Enter Tailwind CSS

Tailwind CSS has become the go-to utility-first CSS framework for modern web development, and for good reason:

It is much simpler than CSS
It is responsive out of the box, built with mobile-first principles
AI editors like Claude Code and Cursor are very fluent with Tailwind.

The Migration Process

What surprised me most was how seamless the migration turned out to be.

Development Environment Setup

The migration was performed on Fedora Linux using a simple but effective workflow:

Repository Setup: Checked out the Git repository from the command line:

git clone https://github.com/bitdribble/bitdribble.github.io.git
cd bitdribble.github.io

IDE Integration: Loaded the project in Cursor (this works equally well in VSCode).
Claude Code Extension: Enabled the Claude Code add-in, which, like Cursor, provides:
- Intelligent code suggestions and refactoring
- Context-aware assistance with Jekyll and Tailwind
- Seamless understanding of project structure and dependencies

How Claude Code Transformed the Migration

The AI assistant proved exceptionally capable at understanding both Jekyll’s architecture and Tailwind’s utility-first approach. Here’s what made it work so well:

1. Jekyll’s Markdown Flexibility

Almost all Jekyll pages can be written in pure Markdown. Take a look at my markdown example page - it’s a full demonstration of how Jekyll processes Markdown content beautifully, even with the new Tailwind styling.

Content creators don’t need to know HTML or CSS
Blog posts remain simple and focused on content

2. Inline HTML When Needed

When you need more sophisticated layouts or Tailwind-specific components, Jekyll’s Markdown processor allows you to embed HTML+Tailwind directly in your Markdown files. For example, I was able to create a sophisticated three-column layout for my about page by simply adding:

<div class="bg-white rounded-lg shadow-lg p-8">
  <div class="grid md:grid-cols-3 gap-8 items-start">

Live Example: Here’s that same three-column layout in action with actual content:

Markdown Content

Write content in simple Markdown syntax without worrying about styling complexities.

Tailwind Styling

Add sophisticated layouts with utility-first CSS classes when you need more control.

Jekyll Processing

Jekyll seamlessly processes both Markdown and HTML, giving you complete flexibility.

This flexibility gives you the best of both worlds - simple content editing in Markdown, with the power to create complex layouts when needed.

3. Enhanced Features with Jekyll

The Jekyll + Tailwind combination also preserves and enhances Jekyll’s powerful features. Take a look at the markdown example page which demonstrates:

Code Syntax Highlighting: Jekyll automatically highlights code blocks with proper syntax coloring for multiple languages:

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

MathJax Integration: Mathematical formulas can be enabled by adding mathjax: true to the page front matter:

---
layout: page
title: Your Page Title
mathjax: true
---

Then you can write beautiful mathematical expressions:

Inline math: $E = mc^2$
Display equations: $\int_{-\infty}^{\infty} e^{-x^2} dx = \sqrt{\pi}$

Page Headers: Jekyll’s front matter system makes it easy to configure pages with metadata, layout selection, and feature toggles like MathJax - all while maintaining the flexibility to use Tailwind styling throughout the content.

The Payoff

The transformation has been dramatic. Here’s a side-by-side comparison of the old Minima theme versus the new Tailwind-powered design:

Before: Jekyll Minima Theme

The original site using Jekyll's Minima theme with dark styling

After: Tailwind CSS Design

The modern site with Tailwind CSS featuring improved navigation and visual hierarchy

The site now boasts:

Modern visuals: Professional and polished
Performance: Purged CSS for leaner loads
Flexibility: I can create any layout I can imagine without fighting the framework

Why Jekyll + Tailwind Rocks

Content Teams: Edit in Markdown, no CSS needed.
Developers: Modern tooling, reusable components
Businesses: Fast, SEO-friendly, low-cost hosting (GitHub Pages, Netlify)

Technical Implementation

The migration involved several key technical steps that Claude Code helped orchestrate seamlessly:

1. Tailwind CSS Integration

Adding the Tailwind CLI:

# Downloaded the standalone Tailwind CLI binary
curl -sLO https://github.com/tailwindlabs/tailwindcss/releases/latest/download/tailwindcss-linux-x64
chmod +x tailwindcss-linux-x64
mv tailwindcss-linux-x64 tailwindcss

Created Tailwind Configuration:

// tailwind.config.js
module.exports = {
  content: [
    './_includes/**/*.html',
    './_layouts/**/*.html', 
    './_posts/**/*.md',
    './*.html',
    './*.md',
    './**/*.md'
  ],
  theme: {
    extend: {
      // Custom colors and styling
    },
  },
  plugins: [],
}

Build Process Integration: Created a Makefile to streamline development with proper process management: make dev starts the web server locally, and Ctrl-C stops it:

# Development with live reload and signal handling
dev:
	@echo "Starting development environment..."
	@trap 'echo "Stopping all processes..."; kill 0' INT; \
	./tailwindcss -o assets/css/tailwind.css --watch & \
	TAILWIND_PID=$$!; \
	bundle exec jekyll serve & \
	JEKYLL_PID=$$!; \
	echo "Development server running. Press Ctrl+C to stop both processes."; \
	wait

# Production build
build:
	./tailwindcss -o assets/css/tailwind.css --minify
	bundle exec jekyll build

2. Replacing the Minima Theme

Removed Theme Dependency:

# _config.yml - Commented out the minima theme
# theme: minima  # Removed - using Tailwind CSS instead

Layout System Redesign:

_layouts/default.html: Created a new base layout with Tailwind styling
_layouts/home.html: Redesigned blog listing with card-based design
_layouts/page.html: Clean page layout with proper typography
_layouts/post.html: Enhanced blog post layout with better readability

3. Component Migration Strategy

Navigation System: Replaced Minima’s navigation with a modern Tailwind-based header featuring:

Responsive dropdown menus
Clean typography and spacing

4. Content Preservation

Front Matter Compatibility: All existing blog posts and pages continued to work without modification. Jekyll’s front matter system remained unchanged:

---
layout: post
title: "My Blog Post"
date: 2025-01-20
categories: [webdev, jekyll]
---

Conclusion

Jekyll + Tailwind is a powerhouse for blogs and company sites, blending content simplicity with design flexibility. Claude Code nailed the migration, delivering a stunning result with minimal effort. Check the source code or live site to see it in action!

DocRouter.AI: Adventures in CSS and AI Coding

2025-07-29T00:00:00+00:00

DocRouter.AI transforms messy, multi-layout business documents into clean, structured data using large language models (LLMs) and schema-driven orchestration. We focus on regulated industries like insurance, healthcare, and supply chain, where precision is non-negotiable. It’s a horizontal data layer application that plugs into vertical-specific apps, acting as an AI accelerator across sectors.

DocRouter.AI User Experience

We’re developing DocRouter.AI as open source because experience shows it leads to better-designed, more resilient code over time. Business value comes from our SaaS version at app.docrouter.ai and enterprise on-VPC installations. We also offer consulting to help other companies build software using similar styles and tools.

GitHub Repository: github.com/analytiq-hub/doc-router

A revolution is underway in software development with tools like Cursor and Claude Code. I’ll dive into how we built DocRouter.AI, sharing lessons that could apply to your projects. (I’ve detailed my tool usage in a previous LinkedIn post, so I won’t repeat it here.)

My Background: From Embedded Systems to AI-Driven Front Ends

I come from a world of cloud, back-end, and embedded work, starting long ago with Linux kernel programming, computer networks, then high performance computing for Wall St, followed by robotics/ROS/computer vision.

In short, my expertise is in embedded systems, back-end work, and data science. When starting DocRouter.AI, I had zero hands-on experience with front-end development: no JavaScript, TypeScript, or React.

But tools like GitHub Copilot (which helped me learn Apache Spark and Terraform from scratch in prior roles) paved the way.

By fall of last year, Cursor enabled me to build DocRouter.AI with a Next.js front end, FastAPI back end, and AWS via Terraform. Claude Code exploded about a month ago, and I’ve begun using it intensively.

Lessons from AI Coding Tools: What Works and What Doesn’t

We often hear engineers claiming they can “zero-shot vibe code” entire apps from scratch. That’s possible for simple, UI-focused apps, but for anything complex, the human engineer must stay in the loop. Knowing when to let AI take over and when to intervene is becoming a black art.

For DocRouter.AI, we use FastAPI on the back end, with all functions in a single file (main.py) to simplify AI editing—easier for pattern searching and consistency. If designing for humans, I’d split it into 8-10 files. But, for AI editors, a single large file simplifies things.

AI agents have “personalities.” Claude Code Agent outperforms Cursor Agent right now. I use Cursor in manual mode, attaching specific files. Keeping FastAPI in one file reduces attachment hassle.

AI editors love “improving” code inconsistently. Most of main.py, our FastAPI interface, was built with Cursor. But for web form support (more on why below), I tasked Claude Code with creating similar FastAPI endpoints (create, list, get, update, delete) to our existing schema ones.

Claude did great but switched to identifying things with UUIDs instead of our MongoDB _id scheme. I almost did not notice it. I caught it by asking Claude Code to explain in practice how the new APIs are used.

Why Schemas, Web Forms, and MongoDB?

I’m a fan of the simplest tool for the job. DocRouter.AI stores documents, LLM prompts, and extraction schemas—MongoDB collections are ideal. Mongo handles blobs via GridFS, avoiding external storage like S3 (which Postgres would require).

Schemas: LLMs can output free-form or structured JSON. Providers support JSON schemas for precise LLM output. For docs, we classify types (e.g., via LLM prompt) then extract a specific schema—like date, patient name, DOB, address, diagnostics, doctor details in a medical prescription.

DocRouter.AI lets users configure extraction schemas for all doc types.
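As a rough illustration, an extraction schema for the medical prescription example might look like the sketch below. The field names and the missing_required helper are hypothetical, not DocRouter.AI's actual schema or API; the dict only follows general JSON Schema conventions.

```python
import json

# Hypothetical JSON Schema for a medical prescription; field names are
# illustrative assumptions, not DocRouter.AI's real schema.
PRESCRIPTION_SCHEMA = {
    "type": "object",
    "properties": {
        "date": {"type": "string"},
        "patient_name": {"type": "string"},
        "date_of_birth": {"type": "string"},
        "address": {"type": "string"},
        "diagnostics": {"type": "array", "items": {"type": "string"}},
        "doctor_name": {"type": "string"},
    },
    "required": ["date", "patient_name", "date_of_birth", "doctor_name"],
}

def missing_required(extraction: dict, schema: dict) -> list:
    """Return required fields that are absent or empty in an LLM extraction."""
    return [field for field in schema["required"] if not extraction.get(field)]

# Made-up LLM output, parsed from a JSON string
llm_output = json.loads(
    '{"date": "2025-01-20", "patient_name": "Jane Doe", "diagnostics": ["flu"]}'
)
print(missing_required(llm_output, PRESCRIPTION_SCHEMA))
# → ['date_of_birth', 'doctor_name']
```

A check like this is one way to flag incomplete extractions before they reach a reviewer; a full JSON Schema validator would also check types.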

Web Forms: Schemas are too abstract for end users accustomed to ERPs (e.g., Epic for EHR, Salesforce for CRM). End users are instead familiar with entering data into web forms tailored to their process.

We had to add web form support to DocRouter.AI: LLM extractions map to pre-populated form fields. LLM-as-judge scores confidence, so users focus on low-confidence fields—cutting effort by up to 90%.

To stay horizontal (not vertical-specific), we avoid hard-coding web forms. Instead, we integrate a schema builder, and a web form builder.

The web form builder was pretty complicated.

But the schema builder was vibe-coded in Cursor (manual mode, incremental development: first the FastAPI back end, then the Axios APIs, then the UI components, and finally the unit tests). It worked out of the box!

For comparison, Base44 solo entrepreneur Maor Shlomo (podcast interview) uses MongoDB too, seeing it as AI-era friendly over SQL. He uses plain React and JavaScript; I use Next.js and TypeScript. His simpler stack might ease things—worth pondering.

Note that Next.js has server-side capabilities, but we use FastAPI for most back-end functions.

DocRouter.AI Document Processing

The Need for Human-in-the-Loop

LLMs are precise with context, but humans must correct rare errors.

The goal is to make human reviews simple - and focus them on likely mistakes.

Ease of adoption is also key. We must adapt to existing workflows, and plug DocRouter gradually into the customer's process.

Our multi-vertical approach gets many more use cases compared to a vertical approach - but demands flexibility and portable software design.

How DocRouter.AI Works

Here is an example DocRouter.AI use from one of our pilots with an insurance company:

We process Acord forms for personal/commercial insurance: We extract insured name/address, insurance type, and can extract coverage limits, loss runs, etc.

Files arrive as email attachments; tools like n8n upload them via REST APIs to a DocRouter workspace.
We configure the extraction formats (e.g., insured details) and prompts (descriptions, examples, counter-examples).
These are stored as JSON schemas.
We upload representative datasets of documents, and iterate schema & prompt design, as well as vendor-agnostic LLM choice, for best accuracy/cost.
We monitor accuracy, and tweak prompts (add examples/counter-examples) for edge cases.

Extracted data can be corrected by the human-in-the-loop, and is available via REST APIs for ERP upload (e.g., TMS for insurance, Epic for hospitals).

The User Perspective: Streamlining Reviews

Before DocRouter, users had to manually enter PDF data into ERP web/UI forms—a laborious process!

DocRouter pre-fills 90%+ fields with LLM extractions and, when configured, with confidence scores from LLM-as-Judge. Users then focus on low-confidence fields, ensuring perfect accuracy with minimal effort.
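The review flow can be sketched in a few lines. This is a minimal illustration, assuming each extracted field carries a judge confidence between 0 and 1; the field names, sample values, the 0.8 threshold, and the fields_to_review helper are all hypothetical.

```python
# Hypothetical extraction result: each field has a value and a confidence
# score in [0, 1] assigned by an LLM-as-Judge step.
extraction = {
    "insured_name": {"value": "Acme Corp", "confidence": 0.98},
    "insured_address": {"value": "12 Main St", "confidence": 0.62},
    "insurance_type": {"value": "commercial", "confidence": 0.91},
    "coverage_limit": {"value": "1,000,000", "confidence": 0.40},
}

def fields_to_review(extraction: dict, threshold: float = 0.8) -> list:
    """Return field names below the threshold, least confident first."""
    low = [
        (name, field["confidence"])
        for name, field in extraction.items()
        if field["confidence"] < threshold
    ]
    return [name for name, _ in sorted(low, key=lambda item: item[1])]

print(fields_to_review(extraction))
# → ['coverage_limit', 'insured_address']
```

Here the reviewer would look at two of the four fields first, and the other two stay pre-filled.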

The human-in-the-loop is critical in regulated fields (healthcare, insurance, fintech, supply chain, legal) where errors in amounts, inventories, or obligations are unacceptable.

Achieving Accuracy and Flexibility

But direct LLM review is impractical. Users prefer familiar web forms.

We are designing the system so project managers can configure tailored web forms linked to extractions and confidences. For each doc, users review pre-populated forms flagged by confidence metrics.

Implementing the Web Form Builder: CSS Adventures

Vibe coding couldn’t handle this yet—AI tools aren’t ready for complex builders.

FormIO Builder in DocRouter.AI

We integrated FormIO (nice UI npm package). But issues arose:

FormIO elements didn’t display right. Cursor/Claude couldn’t fix immediately.
Root cause: FormIO uses Bootstrap; DocRouter uses Tailwind. Global CSS conflicts.

Added @tsed/react-formio (React support) and @tsed/tailwind-formio (Bootstrap-Tailwind fix). Forms showed, but Tailwind broke—AI couldn’t diagnose.

Expert friends suggested including FormIO components through iframes or a shadow DOM.

Shadow DOM: Claude iterated, but drag-and-drop failed across the boundary with the light DOM (FormIO incompatibility).
iframes: Overkill with API messaging for config/state.

Stuck, I fiddled with CSS via the browser inspector. Solution: redefine the Tailwind breakpoints in a stylesheet that is imported last in global.css:

/* FormIO + Bootstrap */
@import 'formiojs/dist/formio.full.min.css';

/* Tailwind */
@tailwind base;
@tailwind components;
@tailwind utilities;

/* Correction so Bootstrap works in FormIO */
@import "~@tsed/tailwind-formio/styles/index.css";

/* Correction to the correction,
   so Tailwind responsiveness works */
@import './formio-custom.css';

Key Takeaway: AI Tools Are Game-Changers, But Humans Are Essential

The programmer must guide AI—tools are amazing, but tough problems (like CSS conflicts) need human expertise. We’re not fully autonomous coding yet, especially for intricate integrations.

If you’re building AI-accelerated tools or facing similar challenges, let’s connect! What are your experiences with Cursor, Claude Code, or CSS headaches? Share in the comments.

Claude Code vs Cursor, July ‘25

2025-07-26T00:00:00+00:00

Today, I ran Anthropic Claude Code side by side with Cursor. Here is my take:

👉 Cursor operates within its own VSCode-like editor, while Claude Code functions as a command line tool.
👉 With Cursor, I find it convenient to paste UI snapshots, especially when coding in React.
👉 Claude Code allows for pasting screenshot images into the shell on Mac, but this feature is not available on Linux terminals.
👉 The agentic flow in Claude Code outperforms that of Cursor. The latter often gets stuck in a loop.
👉 The diff application in Cursor is superior to the text interface in Claude Code. I can easily amend diffs in Cursor. In Claude Code, that is not possible.
👉 Claude Code excels in locating relevant files within the repository, including those in node_modules that are not checked in.
👉 However, Claude Code searches the sandbox each time, whereas Cursor builds an index for efficiency. Cursor is thus faster, and also pretty accurate.
👉 I rarely rely on the Cursor file index. Instead of letting Cursor find the files it needs, I always pass them myself. It’s a lot more accurate.
👉 Cursor simplifies manual file attachment to the context compared to Claude Code.
👉 While Cursor automatically attaches modified files in subsequent steps, Claude Code requires manual re-entry or dynamic file detection.
👉 Both tools require active user involvement and problem-solving rather than just running as agents in the background.

In a recent coding task involving FormIO integration into DocRouter, Claude Code successfully resolved two challenging issues that Cursor couldn’t tackle, and vice versa for another problem.

Our platform, DocRouter.AI, utilizes Tailwind, while FormIO relies on Bootstrap, posing compatibility challenges.

Transitioning FormIO to Tailwind is complex, presenting difficulties for both Claude Code and Cursor.

More about Claude Code

👉 At the AWS Summit in NYC I attended, Anthropic demonstrated Claude Code, but did not dwell on Claude Code agents – presumably, b/c they are not quite Enterprise grade.
👉 But Anthropic uses Claude Code agents internally, for sure.
👉 They use Claude Code to develop Claude Code.
👉 Claude Code is implemented directly in TypeScript, and installs as an npm package.

Conclusion

👉 Each tool is better in different areas. At this point, I use both together – Cursor when I can, and Claude Code when Cursor can’t solve it.
👉 Claude Code could enhance its functionality with editor integration. It has a VSCode add-on, but that is still in its early stages.

Background Jobs for FastAPI

2024-10-19T00:00:00+00:00

I needed my FastAPI backend to spawn background jobs, for example, to run Optical Character Recognition (OCR), Named Entity Recognition (NER), or Large Language Model (LLM) orchestration. FastAPI ran as the API service for a Next.js/React frontend application.

The details of the frontend don’t matter here. But, importantly, the system uses a MongoDB database backend.

My question was – what were the options available, in this case, for architecting the background jobs?

These jobs could take anywhere between a few seconds and a minute to complete.
Load varies. Jobs can stay idle a while, then ramp up and have to handle load at scale, sometimes with hundreds of requests in parallel.

Queues implemented on top of MongoDB

My plan was to implement a queue system in MongoDB, so I could post requests for background work to the queue from FastAPI whenever a REST API is called. Whether I used a background thread or a background process, the worker would read the work request from the queue and process it.

Two options became available, and the purpose of this post is to describe them in a bit of detail:

Background jobs handled as coroutines inside FastAPI. This should handle the required scale later, when I need to distribute FastAPI across multiple processes deployed to ECS or Kubernetes.
Separate background process. This should support multiple processes scaled up and down as part of a process pool.

Option 1: Background Jobs as Coroutines in FastAPI

This approach involves using asynchronous programming within your FastAPI application to handle background tasks. Here are the key points:

Use FastAPI’s BackgroundTasks feature to queue jobs.
Implement a worker coroutine that continuously checks the MongoDB queue for new jobs.
Use asyncio to manage concurrent execution of background tasks.

Pros:

Simpler setup, as it’s integrated within your FastAPI application.
Easier to share resources and state with the main application.

Cons:

May not scale as well for very long-running tasks.
Could potentially impact the performance of your main API if not managed carefully.
Harder to distribute across multiple processes or machines.

Here’s a basic implementation of the FastAPI (main.py):

from fastapi import FastAPI
from pymongo import MongoClient, ReturnDocument
import asyncio
from datetime import datetime
import uuid

app = FastAPI()
client = MongoClient('mongodb://localhost:27017/')
db = client['background_jobs']
queue_collection = db['job_queue']

class BackgroundJobProcessor:
    def __init__(self):
        self.running = False
        # Hold references so in-flight tasks are not garbage-collected
        self.tasks = set()

    async def start_worker(self):
        self.running = True
        while self.running:
            # Atomically claim a pending job. PyMongo calls block, so run
            # them in a thread to keep the event loop free for API requests.
            job = await asyncio.to_thread(
                queue_collection.find_one_and_update,
                {'status': 'pending'},
                {'$set': {'status': 'processing', 'started_at': datetime.utcnow()}},
                return_document=ReturnDocument.AFTER
            )

            if job:
                # Process the job concurrently with the polling loop
                task = asyncio.create_task(self.process_job(job))
                self.tasks.add(task)
                task.add_done_callback(self.tasks.discard)
            else:
                # No jobs available, wait a bit
                await asyncio.sleep(1)

    async def process_job(self, job):
        try:
            # Simulate job processing
            await asyncio.sleep(5)  # Replace with actual job logic

            # Update job status to completed
            await asyncio.to_thread(
                queue_collection.update_one,
                {'_id': job['_id']},
                {'$set': {'status': 'completed', 'completed_at': datetime.utcnow()}}
            )
        except Exception as e:
            # Update job status to failed
            await asyncio.to_thread(
                queue_collection.update_one,
                {'_id': job['_id']},
                {'$set': {'status': 'failed', 'error': str(e)}}
            )

# Initialize the job processor
job_processor = BackgroundJobProcessor()

@app.on_event("startup")
async def startup_event():
    # Start the background worker and keep a reference to its task
    job_processor.worker_task = asyncio.create_task(job_processor.start_worker())

# The endpoints are plain "def": FastAPI runs them in a thread pool,
# so the blocking PyMongo calls do not stall the event loop.
@app.post("/submit-job")
def submit_job():
    job_id = str(uuid.uuid4())
    job = {
        '_id': job_id,
        'status': 'pending',
        'created_at': datetime.utcnow(),
        'data': {'message': 'Hello from background job!'}
    }

    queue_collection.insert_one(job)
    return {"job_id": job_id, "status": "submitted"}

@app.get("/job-status/{job_id}")
def get_job_status(job_id: str):
    job = queue_collection.find_one({'_id': job_id})
    if job:
        return {
            "job_id": job_id,
            "status": job['status'],
            "created_at": job['created_at']
        }
    return {"error": "Job not found"}

Option 2: Separate Background Process

This approach involves creating a separate process or service to handle background jobs. Here’s how it could work:

Implement a separate Python script that acts as a worker process.
Use a robust task queue system like Celery or RQ, or implement your own using MongoDB.
The FastAPI application enqueues jobs, and the worker process(es) dequeue and process them.

Pros:

Better isolation between API and background tasks.
Easier to scale horizontally by adding more worker processes.
Can be distributed across multiple machines more easily.

Cons:

More complex setup and deployment.
Requires additional infrastructure for task queue management.

Here’s a basic implementation using a custom worker process:

FastAPI application (main.py):

from fastapi import FastAPI
from pymongo import MongoClient
from datetime import datetime
import uuid

app = FastAPI()
client = MongoClient('mongodb://localhost:27017/')
db = client['background_jobs']
queue_collection = db['job_queue']

@app.post("/submit-job")
async def submit_job():
    job_id = str(uuid.uuid4())
    job = {
        '_id': job_id,
        'status': 'pending',
        'created_at': datetime.utcnow(),
        'data': {'message': 'Hello from background job!'}
    }
    
    queue_collection.insert_one(job)
    return {"job_id": job_id, "status": "submitted"}

@app.get("/job-status/{job_id}")
async def get_job_status(job_id: str):
    job = queue_collection.find_one({'_id': job_id})
    if job:
        return {
            "job_id": job_id,
            "status": job['status'],
            "created_at": job['created_at']
        }
    return {"error": "Job not found"}

Worker process (worker.py):

from pymongo import MongoClient
from datetime import datetime
import time
import os

client = MongoClient('mongodb://localhost:27017/')
db = client['background_jobs']
queue_collection = db['job_queue']

def process_job(job):
    """Process a background job"""
    print(f"Processing job {job['_id']}")
    
    # Simulate job processing
    time.sleep(5)  # Replace with actual job logic
    
    # Update job status to completed
    queue_collection.update_one(
        {'_id': job['_id']},
        {'$set': {'status': 'completed', 'completed_at': datetime.utcnow()}}
    )
    print(f"Completed job {job['_id']}")

def main():
    print("Starting background job worker...")
    
    while True:
        try:
            # Find a job that's ready to be processed
            job = queue_collection.find_one_and_update(
                {'status': 'pending'},
                {'$set': {'status': 'processing', 'started_at': datetime.utcnow()}},
                return_document=True
            )
            
            if job:
                process_job(job)
            else:
                # No jobs available, wait a bit
                time.sleep(1)
                
        except Exception as e:
            print(f"Error processing job: {e}")
            time.sleep(5)

if __name__ == "__main__":
    main()

To run this setup, you would start your FastAPI application as usual, and then start one or more worker processes using python worker.py.

Recommended Design

Given the requirements, especially the need to distribute across multiple processes when deployed to ECS or Kubernetes, it is best to pick Option 2: Separate Background Process.

This approach will give more flexibility in scaling and managing background jobs independently of your API service. It also aligns well with containerized deployments, where we can have separate containers for API and for workers.

To implement this:

Use MongoDB queue as planned.
Implement the worker process as shown above.
When deploying, you can scale your API and worker containers independently based on load.
Consider using a process manager like Supervisor or Docker’s built-in restart policies to ensure your worker processes stay running.

This approach will allow for handling long-running tasks like Textract processing without impacting your API performance, and it will scale well as the application grows.

Multiple Processes in Option 2

In Option 2, how would I set up multiple worker processes simultaneously? Will find_one_and_update() guarantee that only one worker picks up the job?

Yes, we can definitely have multiple worker processes set up simultaneously in Option 2. This is one of the key advantages of this approach, as it allows for better scalability and parallel processing of jobs.

The find_one_and_update() MongoDB API does guarantee that only one worker picks up a given job. This is because the operation is atomic on a single document, meaning it’s executed as a single, indivisible unit. Here’s how it works:

The find_one_and_update() operation atomically finds a document matching the query criteria and updates it.
If multiple workers are trying to get a job at the same time, only one will successfully update the document and receive it as a result.
The others will either claim a different pending job or, if none remain, get None (or null) as a result, because the document no longer matches the criteria once another worker has updated it.
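The claim semantics can be illustrated without a database. The sketch below is a toy in-memory stand-in, not MongoDB: the InMemoryQueue class is hypothetical, and a lock plays the role of MongoDB's document-level atomicity. Eight threads compete for 100 jobs, and each job ends up claimed exactly once.

```python
import threading

class InMemoryQueue:
    """Toy stand-in for the job collection; NOT MongoDB itself."""
    def __init__(self, job_ids):
        self._jobs = {job_id: "pending" for job_id in job_ids}
        self._lock = threading.Lock()  # plays the role of document-level atomicity

    def find_one_and_update(self):
        """Atomically claim one pending job, or return None if none remain."""
        with self._lock:
            for job_id, status in self._jobs.items():
                if status == "pending":
                    self._jobs[job_id] = "processing"
                    return job_id
            return None

def worker(queue, claimed):
    # Keep claiming until the queue reports no pending jobs
    while (job_id := queue.find_one_and_update()) is not None:
        claimed.append(job_id)  # list.append is thread-safe in CPython

queue = InMemoryQueue(range(100))
claimed = []
threads = [threading.Thread(target=worker, args=(queue, claimed)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every job was claimed exactly once, however the threads interleaved
print(len(claimed), len(set(claimed)))
# → 100 100
```

Without the lock, two threads could both see the same job as pending and both claim it, which is the failure mode the atomic operation rules out.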

To set up multiple worker processes, you could:

Run multiple instances of your worker script.
Use a process manager like Supervisor to manage multiple worker processes.
In a containerized environment (like ECS or Kubernetes), deploy multiple pods running the worker script.

Here’s an example of how you might modify the worker script to allow for easy scaling:

from pymongo import MongoClient
from datetime import datetime
import time
import os

client = MongoClient('mongodb://localhost:27017/')
db = client['background_jobs']
queue_collection = db['job_queue']

def process_job(job, worker_id):
    """Process a background job"""
    print(f"Worker {worker_id} processing job {job['_id']}")
    
    # Simulate job processing
    time.sleep(5)  # Replace with actual job logic
    
    # Update job status to completed
    queue_collection.update_one(
        {'_id': job['_id']},
        {'$set': {
            'status': 'completed', 
            'completed_at': datetime.utcnow(),
            'worker_id': worker_id
        }}
    )
    print(f"Worker {worker_id} completed job {job['_id']}")

def main():
    worker_id = os.getpid()
    print(f"Starting background job worker {worker_id}...")
    
    while True:
        try:
            # Find a job that's ready to be processed
            job = queue_collection.find_one_and_update(
                {'status': 'pending'},
                {'$set': {
                    'status': 'processing', 
                    'started_at': datetime.utcnow(),
                    'worker_id': worker_id
                }},
                return_document=True
            )
            
            if job:
                process_job(job, worker_id)
            else:
                # No jobs available, wait a bit
                time.sleep(1)
                
        except Exception as e:
            print(f"Worker {worker_id} error processing job: {e}")
            time.sleep(5)

if __name__ == "__main__":
    main()

In this modified version:

We’ve added a worker identifier (using the process ID) to the job document when it’s picked up. This can help with debugging and monitoring.
The process_job function now prints which worker is processing the job.

You can run multiple instances of this script, and each will operate independently, picking up jobs as they become available. The find_one_and_update() operation ensures that each job is only processed by one worker, even when multiple workers are running simultaneously.

What if MongoDB has distributed readers?

Will find_one_and_update(), in that case, still be atomic? I am asking because MongoDB is not transactional, but an eventually consistent database.

1. Atomicity of find_one_and_update():

Even in a distributed MongoDB setup, find_one_and_update() is atomic at the document level. This means that for a single document, the operation is all-or-nothing and isolated from other operations.

2. Distributed readers:

In a MongoDB replica set with multiple members, the primary node handles all write operations. Read operations can be distributed across secondary nodes, but write operations (including find_one_and_update()) are always routed to the primary.

3. Eventual consistency:

While reads from secondary nodes are only eventually consistent, the primary node always has the most up-to-date data. Since our find_one_and_update() operation is performed on the primary, we don’t need to worry about eventual consistency for this specific operation.

However, there are some considerations:

Network partitions: In rare cases of network partitions, there might be a brief period where two nodes both believe they are the primary. This could potentially lead to conflicting updates.
Rollbacks: If a primary node fails before replicating a write operation to a majority of the replica set, that operation might be rolled back when a new primary is elected.

To mitigate these issues and ensure stronger consistency, you can:

Use write concern: Specify a write concern that ensures the operation is acknowledged by a majority of replica set members before considering it complete.
Use transactions: For operations that span multiple documents or collections, you can use multi-document transactions (available in MongoDB 4.0+).

A session, however, is tied to a single MongoClient, so FastAPI and the background job can share a session (and its transaction) only if the background job runs as a coroutine inside FastAPI (our Option 1).

Here’s how you might modify the worker function to incorporate these safeguards:

from pymongo import MongoClient, ReturnDocument
from pymongo.write_concern import WriteConcern
from datetime import datetime
import time
import os

client = MongoClient('mongodb://localhost:27017/')
db = client['background_jobs']
# Writes wait for acknowledgment by a majority of replica set members
majority = WriteConcern('majority')
queue_collection = db.get_collection('job_queue', write_concern=majority)

def process_job(job, worker_id, session):
    """Process a background job"""
    print(f"Worker {worker_id} processing job {job['_id']}")
    
    # Simulate job processing (outside any transaction)
    time.sleep(5)  # Replace with actual job logic
    
    # Update job status to completed
    queue_collection.update_one(
        {'_id': job['_id']},
        {'$set': {
            'status': 'completed', 
            'completed_at': datetime.utcnow(),
            'worker_id': worker_id
        }},
        session=session
    )
    print(f"Worker {worker_id} completed job {job['_id']}")

def main():
    worker_id = os.getpid()
    print(f"Starting background job worker {worker_id}...")
    
    while True:
        job = None
        try:
            # Transactions require a replica set deployment
            with client.start_session() as session:
                # Keep the transaction short: it only claims the job, and
                # it commits automatically when the with-block exits
                with session.start_transaction(write_concern=majority):
                    job = queue_collection.find_one_and_update(
                        {'status': 'pending'},
                        {'$set': {
                            'status': 'processing', 
                            'started_at': datetime.utcnow(),
                            'worker_id': worker_id
                        }},
                        return_document=ReturnDocument.AFTER,
                        session=session
                    )
                
                if job:
                    process_job(job, worker_id, session)
                        
        except Exception as e:
            print(f"Worker {worker_id} error processing job: {e}")
            time.sleep(5)
            continue
        
        if job is None:
            # No jobs available, wait a bit (outside the transaction)
            time.sleep(1)

if __name__ == "__main__":
    main()

Be careful not to sleep() inside the transaction! The modification does a few things:

It sets the write concern to ‘majority’, ensuring that each update is acknowledged by a majority of replica set members.
It uses a session and a short transaction to claim the job, which provides stronger consistency guarantees; the job logic and the idle wait run outside the transaction.
It passes the session to both the find_one_and_update() and process_job() functions, so that all database operations for a job share the same session.

Remember to modify your process_job() function to accept and use the session:

def process_job(job, worker_id, session):
    """Process a background job with session support"""
    print(f"Worker {worker_id} processing job {job['_id']}")
    
    # Simulate job processing (outside transaction)
    time.sleep(5)  # Replace with actual job logic
    
    # Update job status to completed
    queue_collection.update_one(
        {'_id': job['_id']},
        {'$set': {
            'status': 'completed', 
            'completed_at': datetime.utcnow(),
            'worker_id': worker_id
        }},
        session=session
    )
    print(f"Worker {worker_id} completed job {job['_id']}")

These changes will provide stronger consistency guarantees in a distributed MongoDB setup, minimizing the risk of job duplication or loss due to network issues or node failures.

The job, at any rate, should be implemented to be idempotent (meaning, if the job is called twice on the same message, the 2nd call should be a no-op).

Why should the job be idempotent?

It is very rare for the same message to be read by two workers. But, even if the queue was designed to guarantee single message delivery under any circumstance, it is still a good practice to design idempotent jobs. That way:

If the sender is not idempotent, and requests the same job twice, the result is not duplicated.
If the job handler partially completes, e.g. due to a network error that causes an exception in the middle of the job – upon retry, an idempotent job should be designed to skip the steps it already completed, and only do the remaining steps.
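Both cases can be sketched with a job that records its own progress. This is a minimal in-memory illustration, not the actual worker: the step names, the fail_once flag, and the run_job helper are hypothetical, and in practice the progress markers would be persisted on the job document in MongoDB.

```python
def run_job(job: dict, log: list) -> None:
    """Run each step at most once, recording progress on the job document."""
    if job.get("status") == "completed":
        return  # a second delivery of the same message is a no-op
    done = job.setdefault("completed_steps", [])
    for step in ("ocr", "extract", "upload"):  # hypothetical step names
        if step in done:
            continue  # finished in an earlier attempt; skip on retry
        if step == "extract" and job.get("fail_once"):
            job["fail_once"] = False
            raise ConnectionError("simulated network error")
        log.append(step)   # stands in for the real side effect
        done.append(step)  # in practice, persist this progress marker
    job["status"] = "completed"

job, log = {"_id": "job-1", "fail_once": True}, []
try:
    run_job(job, log)      # fails midway, after "ocr"
except ConnectionError:
    pass
run_job(job, log)          # retry: skips "ocr", runs the remaining steps
run_job(job, log)          # duplicate delivery: does nothing
print(log)
# → ['ocr', 'extract', 'upload']
```

Each step's side effect happens exactly once, even though the job was invoked three times.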

(Code examples generated with Claude in the Cursor text editor.)

Comments? Suggestions? Alternatives?

I would appreciate your take on this!

How to make a self-driving car

2024-10-19T00:00:00+00:00

Notes on an a16z survey of self-driving cars, and slides from my AI Camp Jan 2024 Talk

The Self-Driving Landscape

Fantastic survey of self driving from Erin Price-Wright: what does the stack look like? What is readily available off the shelf, and what is not? She talks about issues tackled in perception, localization & mapping, control, and planning. Excellent stuff.

Personal Experience with Lidar Systems

I recall dealing with drift in lidar point cloud detection while turning corners at speed due to rotation of laser. …And I recall the issues synchronizing multiple lidars to get a merged point cloud!

Lidar calibration – had to be done manually, because there was no ready product available for auto-calibration…

ROS and System Integration Challenges

This was a time when I was using ROS, and I was integrating C++ and Python ROS nodes… Designing the ROS bag recording infrastructure… Integrating with simulation… Refactoring to simplify the coordinate frames… Dealing with latency and jitter through the ROS components…

A year into it, I moved into architecting the offline computer vision data infrastructure, which meant learning AWS and Terraform, and evaluating data lake vendors like Databricks and Snowflake…

Building from Scratch

Those were heady days! But, it was a lot of work from scratch, building the on-board and cloud system architecture. A lot of it should have been available as a ready-made product, ready for integration – but was, actually, not readily available.

Or if a vendor had some ready-made components, they were pretty expensive for a scrappy startup.

I suppose the self-driving car industry, while bringing a great promise, is still too small for a solid middleware ecosystem to develop. It’s a challenge, and therefore, also an opportunity…

Industry Insights

Anyhow… I enjoyed Erin Price-Wright’s piece tremendously!

While at HackMIT ‘24, I also enjoyed sitting on the same hackathon panel with Chris Urmson, ex-lead at Waymo and current cofounder/CEO of Aurora. He spoke to students about his career; a recording is available here. The hackers/students had very good questions for him… Some were working as interns at a couple of other self-driving car companies, and they had run into issues very similar to the ones I had run into.

My AI Camp Presentation

Back in January, I gave a talk on How to Build a Self-Driving Car - A Look at Robotics System Design. It goes into a lot of the same details in Erin’s survey, but more from an implementation angle.

I am making the slides available for the first time: How to Build a Self-Driving Car Slides. Comments would be appreciated!

Conclusion

Self driving cars are obviously a huge subject, and my presentation was through a particular viewpoint – that of a hands-on implementer of system design, both at the ROS level, and at data and ML infrastructure for the cloud.

Difficult though it may be, it is also one of the most exciting things an engineer can do.

My notes from HackMIT ‘24

2024-09-26T00:00:00+00:00

About 120 teams of 3-5 undergrad students compete for 24 hours to build projects. This year, the tracks were Sustainability, Education, Interactive Media, and Healthcare.
A number of companies (Fetch.ai, Modal, Convex, Terra API, Clerk, InterSystems, Suno, Akamai Technologies, …) set up booths and assist hackers with documentation and infrastructure credits for their platform. These companies offer their own prizes in addition to the main HackMIT prizes.
Hackers either build an independent project, or build on top of the platforms above – assembling a business plan and building a working demo.
Mentors are available to assist with planning, and with technical issues. The companies themselves provide a lot of technical help.
The hackers themselves come from many other colleges aside from MIT. I met some brilliant young hackers from places like Purdue, Carnegie Mellon, …
Some of the visiting teams booked an apartment for the day, and worked overnight. Some stayed up the whole night to finish things up!
The teams power through technical difficulties, and pivot as needed. To be successful, most projects need both a working demo and a business plan.

What projects did I see?

An FPGA-based BF compiler (if you don’t know what the BF language is… find out on your own!)
Diffusion implemented from scratch in PyTorch
GenAI parsing classic poetry, identifying related stanzas, and combining poets together
A bike maps app that allows you to mark impassable/unbikable streets. That was a pretty impressive implementation!
An automated email parser that handles documents and fills forms for you. Could be used, for example, by landlords who need to handle paperwork from renters.
A music app connected to the Apple Watch, playing music faster when you run faster
A NeRF algorithm for CAT scans
AI chat assistant, with ability to order food, book appointments and reserve flights
Videos transcribed, and used to generate songs. Can be used to create customized songs for social media postings, or for marketing campaigns.
Physical therapy assistant app, that detects angle of arm motion
State-wide monitor of wildfires, using satellite maps to track vegetation, and using CV model to estimate fire risk – with interfaces for fire departments, govt, and home owners
Video app to detect/translate American Sign Language
Nicotine modulating app using AI

Full list of projects is available on the HackMIT website.

What tech did the hackers use?

Many apps used React on the front end, FastAPI on the back end, connected to API services like Fetch.ai, Modal, TerraAPI, InterSystems,… Pipelines of varying levels of complexity were implemented. Pretty impressive work for 24 hrs!

A small number of projects were developed with lower-level tools (direct PyTorch, lidar drivers, VHDL for FPGA programming…), but the most successful projects, developed quickly, used the higher-level tools.

Tech breakdown:

About three quarters of hackers used GenAI
About a quarter used computer vision
A few used diffusion, NeRF, and other more advanced algorithms
There was no robotics or self-driving
A very small number tackled financial apps
All hackers used MacBooks (I did not notice Linux laptops at all)
VS Code was the editor of choice

How were the projects assessed?

Projects were rated for:

Innovation (30%)
Technical Complexity (30%)
Impact (30%)
Learning & Collaboration (10%)
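
As an illustration, the weights above can be folded into a single number per project. This is only a minimal sketch: the 1-10 rating scale, the category key names, and the `weighted_score` helper are my own assumptions, not a description of how the actual HackMIT rating app computes scores.

```python
# Category weights taken from the rubric above.
# The key names and the 1-10 rating scale are assumptions for illustration.
WEIGHTS = {
    "innovation": 0.30,
    "technical_complexity": 0.30,
    "impact": 0.30,
    "learning_collaboration": 0.10,
}


def weighted_score(ratings: dict[str, float]) -> float:
    """Combine per-category ratings into one score using the rubric weights."""
    missing = WEIGHTS.keys() - ratings.keys()
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(WEIGHTS[category] * ratings[category] for category in WEIGHTS)


# Hypothetical ratings for one project (not real HackMIT data).
example = {
    "innovation": 8,
    "technical_complexity": 9,
    "impact": 7,
    "learning_collaboration": 10,
}
print(round(weighted_score(example), 2))  # 0.3*8 + 0.3*9 + 0.3*7 + 0.1*10 = 8.2
```

With these weights, a strong showing in Learning & Collaboration moves the total far less than a one-point change in any of the three main categories.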

Assessment criteria:

How novel or unique was the idea? Were there similar projects seen at other hackathons? This, actually, made a difference when the top 4 projects were selected.
Did the demo actually work end to end?
Was there a functional user interface? Did the team think about user experience?
Was the project technically impressive?
Does the project solve a significant problem? Can it be further extended to solve a real problem?
Did all team members collaborate? If a single person team, did the hacker do something difficult and learn something new?

How did the judging rounds work?

Round 1

A first round of judges went table to table, and spent 10 minutes with each project, hearing the presentations, asking questions, making suggestions… and rating the project through a rating app, in each category (Innovation, Technical Complexity, Impact, Learning & Collaboration).

Round 2

The winners of the 1st round were judged again, in a 2nd round, by judges going table to table.

Final Round

This resulted in 12 top teams being selected for the last round. These 12 presented again, but this time in front of the Panel Judges.

Each of the Panel Judges got to see presentations from half of the 12 top teams, and got to ask questions. Each presentation took no more than 8 minutes, with 2 minutes of questions from the Judges. The process moved pretty quickly.

At the end, the Judges got together and briefly discussed their favorite teams. Since no judge saw more than half the teams, each had to briefly share their impressions, so all judges could vote for the winner.

How were the winners selected?

Ultimately, the top four teams were selected, by vote. Then, after more discussion, the top winner was picked – then, the winner in each track.

I have to say, all teams were very impressive, and the difference between the winners and the almost-winners in the last round was pretty small. Technically, all teams were super sharp. They were quick to take advantage of the available tools and APIs, and to orchestrate them together into impressive working products.

HackMIT 2024 was an incredible experience showcasing the next generation of tech innovators. The level of technical sophistication and creativity demonstrated in just 24 hours was truly remarkable.