Blog

  • How I Made Bulk Ordering 1200% Faster with Multithreading!

    Case Study

    Handling bulk orders is no joke. When I started optimizing the system for a client, the bulk order process took 4 hours ⏳ to complete. That’s insane, right? Imagine running a store and waiting that long for bulk orders to process.

    But why were bulk orders taking so long? The system handles orders for printing on all sorts of objects, and each order involves some image processing.

    So I decided to fix it. After a lot of research and trial and error, I built a multithreaded scheduler with a queueing system that keeps all CPU cores busy and queues incoming orders whenever every core is occupied. The result? A 1200% performance boost! 🚀

    Here’s how I did it.

    🔍 The Problem: A Messy, Slow Bulk Order System

    When I first examined the existing code, I was met with 100,000+ lines of unoptimized, tangled logic written 10–12 years ago. Refactoring it meant understanding nearly every line—if not all, then at least a significant portion—which would have taken months.

    Instead of a full rewrite, I focused on fixing the core issue.

    What I Found:

    • 🌋 Thread Management Was a Disaster

      • The system had a fixed (hardcoded) number of "threads", though, to be precise, they were actually processes—because, of course, cron was just brute-force launching scripts at regular intervals. Scaling? Yeah, that wasn’t really a thing.
      • Processing was often painfully sequential, turning what should’ve been a sprint into a sluggish, unnecessary waiting game.
    • 🪈 Bulk Orders Were Stuck in a Single Pipeline

      • Each process had a fixed cap on how many products it could process—because, clearly, flexibility is overrated.
      • If the product count was low, the system ran sequentially—pretty much the worst possible way to handle speed.
      • And when CPU cores were maxed out, new jobs weren’t queued properly, because efficient queuing? Nah, who needs that?

    Clearly, I needed a better approach.

    ⚡ The Experiment

    • I ran multiple experiments to see what would work best:

      I quickly realized this was all about scheduling and processing. If I could schedule the work in parallel and spawn n processes (one per CPU core), each handling a separate product, that might just solve the problem, at least to some extent.

    • PHP-Based Thread Management

      • Since everything was already written in PHP, I tried to manage the processes directly in PHP. It was a nightmare. PHP just isn’t built for concurrency, which made this an inefficient and problematic solution.
      • Yeah, this was definitely not the way forward.
    • Golang-Powered Multithreaded Scheduler

      Then, I decided to get a bit daring and offload thread management to Golang.

      • I built a lightweight Go program that retrieves all products, spawns multiple goroutines (Go’s lightweight threads) that each run the order-processing PHP script via shell execution, and collects the responses using a fan-out, fan-in concurrency pattern.
      • When I ran it, the results were game-changing: processing time dropped from 4 hours to just 20 minutes, a 12x speedup! 🚀
      • 💪 The system now scales with server resources up to a point (beyond that, PHP itself limits scalability).
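    The fan-out, fan-in idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the project: the `processAll` and `phpRunner` names, the `process_order.php` script name, and the convention of passing the product ID as a command-line argument are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
	"sync"
)

// result pairs a product ID with the output (or error) of processing it.
type result struct {
	productID string
	output    string
	err       error
}

// runner processes one product and returns its output.
type runner func(productID string) (string, error)

// phpRunner shells out to `php <script> <productID>`.
// The script name and argument convention are assumptions for this sketch.
func phpRunner(script string) runner {
	return func(id string) (string, error) {
		out, err := exec.Command("php", script, id).CombinedOutput()
		return string(out), err
	}
}

// processAll fans product IDs out to a fixed number of goroutines and
// fans their results back in through a single channel.
func processAll(productIDs []string, workers int, run runner) []result {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	results := make(chan result)

	// Fan-out: each worker pulls IDs until the jobs channel is closed.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				out, err := run(id)
				results <- result{productID: id, output: out, err: err}
			}
		}()
	}

	// Feed the jobs, then close results once every worker has exited.
	go func() {
		for _, id := range productIDs {
			jobs <- id
		}
		close(jobs)
		wg.Wait()
		close(results)
	}()

	// Fan-in: collect every result on the calling goroutine.
	all := make([]result, 0, len(productIDs))
	for r := range results {
		all = append(all, r)
	}
	return all
}

func main() {
	ids := []string{"101", "102", "103"}
	// Stub runner so the sketch runs without PHP installed; swap in
	// phpRunner("process_order.php") to shell out for real.
	stub := func(id string) (string, error) { return "processed " + id, nil }
	for _, r := range processAll(ids, runtime.NumCPU(), stub) {
		fmt.Printf("%s -> %s (err: %v)\n", r.productID, r.output, r.err)
	}
}
```

    Sizing the pool with `runtime.NumCPU()` mirrors the "one process per core" idea. Results arrive in completion order, not submission order, so each one carries its product ID.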

    🎯 The Final Plan

    Seeing the massive improvement, I built a long-term solution:

    • A Golang microservice with a multithreaded scheduler to handle bulk order processing efficiently while maximizing CPU utilization.
    • A queueing system to ensure no job gets lost, even when all cores are busy.
    • Real-time status updates right on the dashboard.
    • On-demand order processing, both manual and automated, with the ability to cancel bulk processing mid-way if needed.
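    To make the queueing and mid-way cancellation points concrete, here is a small sketch of a worker pool fed by a buffered queue and stopped through a `context`. The `Scheduler`, `Job`, and `runBatch` names and the queue size of 4 are hypothetical choices for illustration, not the microservice’s actual API.

```go
package main

import (
	"context"
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// Job is one unit of work; it receives a context so long jobs can stop early.
type Job func(ctx context.Context)

// Scheduler (a hypothetical name) runs Jobs on a fixed pool of workers
// fed by a buffered queue.
type Scheduler struct {
	queue  chan Job
	ctx    context.Context
	cancel context.CancelFunc
	wg     sync.WaitGroup
}

func NewScheduler(workers, queueSize int) *Scheduler {
	if workers < 1 {
		workers = 1
	}
	ctx, cancel := context.WithCancel(context.Background())
	s := &Scheduler{queue: make(chan Job, queueSize), ctx: ctx, cancel: cancel}
	for i := 0; i < workers; i++ {
		s.wg.Add(1)
		go s.worker()
	}
	return s
}

func (s *Scheduler) worker() {
	defer s.wg.Done()
	for job := range s.queue {
		if s.ctx.Err() != nil {
			continue // cancelled: drain queued jobs without running them
		}
		job(s.ctx)
	}
}

// Submit blocks while the queue is full, so a job waits instead of being dropped.
func (s *Scheduler) Submit(job Job) { s.queue <- job }

// Cancel makes the workers skip everything that is still queued.
func (s *Scheduler) Cancel() { s.cancel() }

// Close stops intake and waits for the workers to finish.
func (s *Scheduler) Close() {
	close(s.queue)
	s.wg.Wait()
}

// runBatch submits n jobs and returns how many actually ran.
// Job number cancelAt cancels the batch; pass a negative value to never cancel.
func runBatch(n, workers, cancelAt int) int {
	s := NewScheduler(workers, 4)
	var ran int64
	for i := 0; i < n; i++ {
		i := i
		s.Submit(func(ctx context.Context) {
			atomic.AddInt64(&ran, 1)
			if i == cancelAt {
				s.Cancel()
			}
		})
	}
	s.Close()
	return int(atomic.LoadInt64(&ran))
}

func main() {
	fmt.Println("no cancel:", runBatch(20, runtime.NumCPU(), -1))
	fmt.Println("cancel at job 5, one worker:", runBatch(20, 1, 5))
}
```

    Because `Submit` blocks when the queue is full, producers slow down instead of losing jobs. After `Cancel`, workers keep draining the queue but skip the work, so nothing blocks forever.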

    Multithreaded Scheduler & Queuing System

    • I noticed multiple types of CSV processing happening in the system, all relying on the same cron-based scheduling—where timing was critical.
    • To streamline this, I extracted the scheduler and queuing logic, making it adaptable through an interface that could handle any task.
    • I then built a task factory that produces tasks implementing that interface, allowing us to add new task types easily without modifying the core logic.
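    One way such an interface-plus-factory arrangement could look is sketched below. The `Task` method set, the `TaskFactory` type, and the `bulk_order` task kind are assumptions made for this example; the real scheduler’s interface may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Task is the contract the scheduler would depend on (hypothetical method set).
type Task interface {
	Name() string
	Run() error
}

// bulkOrderTask is a hypothetical task type that processes one CSV file.
type bulkOrderTask struct{ file string }

func (t bulkOrderTask) Name() string { return "bulk_order:" + t.file }

func (t bulkOrderTask) Run() error {
	if t.file == "" {
		return errors.New("bulk_order: no CSV file given")
	}
	return nil // real order processing would go here
}

// TaskFactory maps a task kind to a function that builds that task.
type TaskFactory struct {
	builders map[string]func(payload string) Task
}

func NewTaskFactory() *TaskFactory {
	return &TaskFactory{builders: make(map[string]func(payload string) Task)}
}

// Register adds a new task kind without touching scheduler code.
func (f *TaskFactory) Register(kind string, build func(payload string) Task) {
	f.builders[kind] = build
}

// Create builds a task of the given kind, or reports an unknown kind.
func (f *TaskFactory) Create(kind, payload string) (Task, error) {
	build, ok := f.builders[kind]
	if !ok {
		return nil, fmt.Errorf("unknown task kind %q", kind)
	}
	return build(payload), nil
}

func main() {
	factory := NewTaskFactory()
	factory.Register("bulk_order", func(p string) Task { return bulkOrderTask{file: p} })

	task, err := factory.Create("bulk_order", "orders.csv")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(task.Name(), "->", task.Run())
}
```

    Adding another CSV job then means writing one new type and one `Register` call, while the scheduler only ever sees the `Task` interface.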

    💡 Why Not Just Refactor the Old Code?

    • Refactoring 100,000+ lines could take months (or even a year), and we still wouldn’t know if the performance would improve enough.
    • Instead, we fix the core issue first (bulk order speed), then refactor the rest piece by piece alongside other improvements.

    🚀 Wrapping It Up

    • This project proves one thing: big improvements don’t always require massive rewrites.
    • Instead of refactoring blindly, I focused on one key performance bottleneck (thread management), built a multithreaded scheduler with an efficient queue, and delivered a 1200% speed boost in the process.

    What’s Next?

    • This was just a glimpse of how multithreading and queuing can drive significant performance improvements! If you’re interested in exploring the code, check out the worker pool implementation here: GitHub – akash-aman/threadpool. I’ll also be writing a post about its implementation soon, so stay tuned 🎉.
  • Inside My Workspace: A Guided Setup Tour

    Welcome to my workstation setup tour! As a Senior Software Developer and tech enthusiast, I’ve carefully curated a high-performance workspace that balances power, efficiency, and aesthetics. Here’s a breakdown of my setup:

    🎯 Primary Workstation

    • Chip: Apple M4 Pro/Max (12-core CPU, 40-core GPU)
    • RAM: 128GB Unified Memory
    • Storage: 2TB NVMe SSD
    • Display: 16-inch Liquid Retina XDR Nano Texture (3456 x 2234, 120Hz ProMotion, HDR, 1600 nits peak brightness)
    • Ports: 3x Thunderbolt 5 (TB5), HDMI 2.1, SDXC card slot, MagSafe 3, 3.5mm headphone jack
    • Ecosystem: AirPods, iPad Pro M4, Apple Watch Ultra, iPhone 16 Pro Max
    • Battery Life: Up to 22 hours video playback

    The MacBook Pro M4 is the heart of my setup. With Thunderbolt 5, I get 80Gbps bandwidth (up to 120Gbps dynamically), making external storage, multiple high-resolution displays, and peripherals seamless. The ProMotion display and HDR capabilities enhance productivity and media consumption.

    🎮 Gaming & Secondary Workstation

    • Processor: 12th Gen Intel® Core™ i9-12900H (14 cores: 6 P-cores, 8 E-cores, up to 5.0GHz)
    • GPU: NVIDIA® GeForce RTX™ 3070 Ti (8GB GDDR6, ROG Boost up to 120W)
    • Display: 16-inch ROG Nebula Display (QHD+ 2560×1600, 165Hz, 3ms, 100% DCI-P3, Adaptive-Sync, Dolby Vision HDR, MUX Switch + Optimus)
    • RAM: 32GB DDR5 (expandable)
    • Storage: 3TB PCIe NVMe Gen 4 SSD
    • Ports: USB-A 3.2, USB-C, HDMI 2.0b, RJ45 LAN, Thunderbolt 4
    • Connectivity: Wi-Fi 6E + Bluetooth 5.3

    I use the ASUS ROG Zephyrus M16 (2022), a real powerhouse, mainly for gaming, testing high-performance applications, and virtualization. The MUX Switch ensures direct GPU access for maximum FPS in games, while Adaptive-Sync eliminates screen tearing.

    📷 Camera & Lens

    • Camera Body: Sony α1 II (ILCE-1M2)
      • 50.1 MP Full-Frame Stacked Exmor RS CMOS Sensor
      • Blackout-free 30 fps continuous shooting
      • AI-based subject recognition AF
      • 8.5-stop in-body image stabilization
      • 8K/60p and 4K/120p video

    • Lens: Sony FE 28-70 mm F2 GM (SEL2870GM)
      • Constant F2.0 aperture for low-light and depth of field control
      • G Master optical design with XA & ED elements
      • Dual XD Linear Motors for fast, precise AF
      • De-clickable aperture ring for smooth video transitions
      • Dust and moisture resistant design

    🖥️ Dual Monitors

    • LG UltraGear 27GP850-B (x2) – 27-inch, QHD (2560×1440), 165Hz, Nano IPS, 1ms GtG, HDR10, G-Sync Compatible, FreeSync Premium
    • Monitor Arm: Jin Office Heavy-Duty Dual Monitor Stand (Gas Spring, Fully Adjustable, 15kg per arm)

    The dual 165Hz monitors provide ultra-smooth visuals for both work and gaming, with HDR support and factory-calibrated color accuracy.

    🎙️ Audio Setup

    • Mic: HyperX QuadCast S (RGB, Condenser, USB, Built-in Anti-Vibration Shock Mount, Tap-to-Mute Sensor)
    • Boom Arm: Rode PSA1+ Desk-mounted Broadcast Arm
    • Headphones: AirPods Pro (Seamless Apple Integration, Spatial Audio, Adaptive Transparency)
    • Smart Speaker: Google Home Mini (Voice Assistant, Smart Home Control)

    For calls, streaming, and recording, the QuadCast S ensures studio-quality sound, while AirPods Pro sync flawlessly with my Apple ecosystem.

    ⚡ Connectivity & Docking

    • Dock: Dell Thunderbolt Dock WD22TB4
      • Ports: 2x Thunderbolt 4, Multiple USB-A/C, Ethernet, DisplayPort
      • Resolution Support: 5K 60Hz (Single Display) / 4K 60Hz (Quad Display)
    • MacBook Thunderbolt 5 Ports allow seamless high-speed connections, supporting multiple 8K displays and external SSDs at full bandwidth.

    ⌨️ Peripherals

    • Keyboard: Logitech MX Keys S (Wireless, Backlit, Bluetooth, Multi-Device, USB-C, Quiet Typing)
    • Mouse: Logitech MX Master 3S (Ergonomic, Multi-Device, 8000 DPI, MagSpeed Scroll, Silent Clicks)

    These peripherals enhance workflow with their customizable buttons, seamless switching between devices, and ultra-smooth performance.

    🪑 Furniture & Ergonomics

    • Gaming Chair: Cybeart Apex Gaming Chair (Ergonomic, Memory Foam, Adjustable Armrests, Lumbar Support)
    • Standing Desk: Jin Office Electric Height Adjustable Desk (Dual Motor, Memory Presets, 125kg Capacity, 1500x750mm Tabletop)

    The motorized standing desk lets me switch between sitting and standing effortlessly, reducing strain during long work sessions.

    📱 Ecosystem

    • iPhone 16 Pro Max (1TB, A18 Pro Chip, Titanium Build, USB-C, 120Hz ProMotion, Always-On Display)
    • iPad Pro M4 (1TB, 13-inch Ultra Retina XDR, M4 Chip, Apple Pencil Pro Support)
    • Apple Watch Ultra (Black, Rugged Design, 49mm, Dual-Frequency GPS, 36-hour Battery Life)
    • AirPods Pro (Adaptive Noise Cancellation, Spatial Audio, Lossless Audio with Apple Vision Pro)

    The Apple ecosystem ensures everything syncs flawlessly across devices, making my workflow more efficient and enjoyable.

    This setup is designed for productivity, gaming, and seamless multitasking 🚀

  • The Portfolio 🚀

    I am excited to share the fascinating journey of developing my digital portfolio. Crafting a compelling digital portfolio involves a perfect blend of creativity and technical prowess.

    In this article, I will dive deep into the architecture and discuss how I tackled two crucial web performance metrics to ensure a seamless user experience for every visitor.

    Journey

    Building my portfolio was a thrilling journey filled with various challenges and exciting discoveries. I ventured into exploring different frameworks like Svelte, Astro, Gatsby, and Hugo, each offering unique possibilities for my project.

    As I set out to find the perfect framework, my main goal was to create a frontend user experience that was as smooth as possible, accompanied by outstanding performance metrics. Svelte caught my attention with its impressive performance, but I noticed a slight delay during page transitions. After some investigation, I learned that this was due to page data not loading early enough when it came into view. Although it might be a bit technical, I strongly recommend giving Svelte a try. It’s an ultra-performant magical framework that truly works wonders.

    Despite Svelte’s allure, I decided it might not be the ideal fit for my project. I sought a framework that would allow me to update static content quickly without enduring lengthy build times, which also led me to steer away from projects like Gatsby.

    Astro and Hugo presented different challenges too. Astro’s completely static build meant that page changes required a full page refresh, impacting the seamless experience I desired.

    While exploring various solutions, I considered the architecture’s long-term maintenance implications 💸👀. That’s when I came up with an idea 💡: what if I converted markdown to JSON using multithreaded code? Excitedly, I wrote multithreaded Go code, but it didn’t perform as well as I had hoped, taking 10 to 16 minutes for 40K markdown records. Amusingly, I knew I wouldn’t be writing 40K records anytime soon! 😂 Nevertheless, my pursuit of top-notch performance was relentless.

    In my quest for efficiency 🚀, I discovered that Hugo could conveniently convert markdown into JSON. So, I set up a Hugo project that performed this conversion and consumed the JSON in Svelte for server-side rendering. This approach significantly reduced build times, but there was still a slight delay in the frontend experience 😓.

    Finally, after trying out several frameworks, I decided to give Next.js a shot. I started a project on its new app directory, which offers fantastic features like shared layouts between routes that enhance the UI’s performance. These layouts preserve state, remain interactive, and avoid unnecessary re-renders. One crucial feature that impressed me greatly was On-Demand Incremental Static Regeneration (ISR), which saves build time by generating static content on demand.

    I’ve crafted this portfolio using Next.js 13 app directory structure, combined with a Strapi backend featuring GraphQL API endpoints to manage media and content.

    One of my primary goals was to ensure a smooth user experience while maintaining the ability to update content without the hassle of rebuilding the entire project.

    Considering that the content is mostly static, I opted for a powerful feature provided by Next.js called On-Demand Incremental Static Regeneration (ISR).

    This innovative technique enables the regeneration of static pages on-demand, saving valuable time and resources while ensuring that any content updates are efficiently propagated.