How to implement csv exports (millions of rows) in Laravel with a distributed approach using Queues and Jobs
Image from inspector.dev


I was working remotely yesterday and had to implement an export feature for a list of members in a system. The result had to be a CSV file. There were millions of members in the database, and Laravel's stream download was not working:

return response()->streamDownload($callback, 'download.csv', $headers);

With that much data, the download would break midway and the full dataset was never returned.
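For reference, the callback passed to `streamDownload` typically opens `php://output` and writes rows as it iterates the query. A minimal sketch of what this kind of attempt looks like (the column names and the `user` relation are illustrative, based on the job code later in this post):

```php
use App\Models\Member;

$callback = function () {
    $out = fopen('php://output', 'w');
    fputcsv($out, ['Member ID', 'Full Name', 'Email']);

    // cursor() hydrates one model at a time to keep memory low, but the
    // HTTP response still has to stay open for the entire export
    foreach (Member::with('user')->cursor() as $member) {
        fputcsv($out, [
            $member->id,
            $member->user->first_name . ' ' . $member->user->last_name,
            $member->user->email,
        ]);
    }

    fclose($out);
};

return response()->streamDownload($callback, 'download.csv', [
    'Content-Type' => 'text/csv',
]);
```

With millions of rows, the connection has to survive the entire export, which is exactly where this approach falls over.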

I also tried running the export in a job, but the job kept consuming so much memory that the export failed anyway. Even though I was not holding on to many variables, and I set the ones I no longer needed to null, it did not help the situation.

Note: PHP's garbage collector only reclaims values that are no longer referenced. You can trigger a cycle-collection pass with gc_collect_cycles(), but it will not free memory that is still reachable, so forcing collection could not fix the memory issue here.

Increasing the memory limit for the Laravel job was a no-go for me. It may work for some people, but I decided to use a different approach that requires less memory.

So here's what I did: I created a job that re-dispatches itself to the queue until it has finished exporting the data to the filesystem. The user can then download the file afterwards. This time, since the file had already been generated, the Laravel stream download shown above worked just fine.

Let's see some code: Here is a stripped-down version of the Job file

<?php

namespace App\Jobs;

use App\Models\GeneralExport;
use App\Models\Member;
use Carbon\Carbon;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Storage;

class CreateGeneralExportFileJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public $timeout = 1200;

    /**
     * Create a new job instance.
     *
     * @return void
     */
    public function __construct(
        private GeneralExport $export,
        private string $exportFileName,
        private int $page = 1,
    ) {
        //
    }

    /**
     * Execute the job.
     *
     * @return void
     */
    public function handle()
    {
        $members = $this->getMembers();

        $columns = [
            'Member ID',
            'Full Name',
            'Phone Number',
            'Gender',
            'Date of Birth',
            'Email',
        ];

        $filesystemAdapter = Storage::disk('public');
        if ($this->export->status === 'pending') {
            $fileName = 'general_exports/' . Carbon::now()->timestamp . '-' . $this->exportFileName . '-' . $this->export->user_id . '.csv';

            // add the headers only on the first run of this job...
            // on subsequent runs, only append the data
            $filesystemAdapter->append($fileName, implode(',', $columns) . PHP_EOL);
        } else {
            // subsequent runs receive the already-generated file path
            $fileName = $this->exportFileName;
        }

        if ($this->export->status !== 'processing') {
            $this->export->update([
                'status' => 'processing',
                'status_message' => "Job {$this->page} in export processing started",
            ]);
        } else {
            $this->export->update([
                'status_message' => "Job {$this->page} in export processing started",
            ]);
        }

        $fileResource = fopen($filesystemAdapter->path($fileName), 'a+');

        foreach ($members as $member) {
            // fputcsv handles quoting/escaping of fields that contain
            // commas or quotes, which a plain implode() would not
            fputcsv($fileResource, [
                $member->id,
                $member->user->first_name . ' ' . $member->user->last_name,
                $member->user->phone,
                $member->gender,
                $member->dob,
                $member->user->email,
            ]);
        }

        fclose($fileResource);

        $nextPageUrl = $members->nextPageUrl();
        $nextPage = null;
        if (!is_null($nextPageUrl)) {
            $nextPage = (int) explode('=', $nextPageUrl, 2)[1];
        }

        if (is_null($nextPage)) {
            // we are done processing
            $this->export->update([
                'status' => 'processed',
                'status_message' => 'Export file processed successfully and ready for download',
                'file' => $fileName,
            ]);
            return;
        }

        // refresh to get the current state of the export before using it for the next job
        $this->export->refresh();

        dispatch(new static($this->export, $fileName, $nextPage));
    }

    public function getMembers()
    {
        // eager-load the user relation to avoid an N+1 query per member
        return Member::with('user')->paginate(10000, ['*'], 'page', $this->page);
    }
}

This snippet can be found as a gist here:
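The GeneralExport model itself is not shown in the post. Based on the columns the job touches (status, status_message, file, user_id), a minimal sketch of the migration and model might look like this — the field names and defaults are inferred, not taken from the original code:

```php
// database/migrations/xxxx_create_general_exports_table.php (sketch)
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

Schema::create('general_exports', function (Blueprint $table) {
    $table->id();
    $table->foreignId('user_id')->constrained();
    $table->string('status')->default('pending'); // pending | processing | processed
    $table->string('status_message')->nullable();
    $table->string('file')->nullable();
    $table->timestamps();
});

// app/Models/GeneralExport.php (sketch)
namespace App\Models;

use Illuminate\Database\Eloquent\Model;

class GeneralExport extends Model
{
    protected $fillable = ['user_id', 'status', 'status_message', 'file'];
}
```

The 'pending' default matters: the first run of the job uses it to decide when to write the CSV header row.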

To dispatch this job, you do it this way from the controller where the download action is handled.

dispatch(new CreateGeneralExportFileJob($export, 'download.csv', 1));

I went with this approach because:

  • It uses less memory
  • There is no chance of the jobs overlapping, since the next job is queued only when the running job is done. I did not want to get into the nitty-gritty of creating a non-overlapping job with Laravel's WithoutOverlapping middleware
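Once the export reaches the 'processed' status, the stored file can be served with the same streamDownload call from the top of the post. A minimal download-endpoint sketch (route binding and disk name assumed to match the job above):

```php
use App\Models\GeneralExport;
use Illuminate\Support\Facades\Storage;

public function download(GeneralExport $export)
{
    abort_unless($export->status === 'processed', 404);

    $disk = Storage::disk('public');

    return response()->streamDownload(function () use ($disk, $export) {
        // stream the pre-generated file to the client in chunks
        // instead of loading the whole CSV into memory
        $stream = $disk->readStream($export->file);
        fpassthru($stream);
        fclose($stream);
    }, 'download.csv', ['Content-Type' => 'text/csv']);
}
```

Because the file already exists on disk, the response only has to copy bytes, which is why the stream download that failed during generation works fine here.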

Hope this helps someone facing the same issue. Cheers 🍻
