Results 1 to 6 of 6

Thread: Download csv and import into database

  1. #1
    Senior Member
    Join Date
    Apr 2004
    Location
    Manchester, UK
    Posts
    168

    Default Download csv and import into database

    I need to download a large csv file (300 mb uncompressed) and import it into mysql with cron.

    Now, I had a php script that downloaded the file in chunks and savig it to the /tmp directory using:

    PHP Code:
    $handle fopen($url"rb");
    while (!
    feof($handle)) {
        
    $contents fread($handle8192);
        
    $handle2 fopen($file'a');
        
    fwrite($handle2$contents);
        
    fclose($handle2);
    }
    fclose($handle); 
    Then in I'd import it into mysql using:

    Code:
    LOAD DATA LOCAL INFILE '/tmp/filename.csv' REPLACE INTO TABLE table_name
    FIELDS TERMINATED BY ','
    OPTIONALLY ENCLOSED BY '\"'
    LINES TERMINATED BY '\n'
    IGNORE 1 LINES";
    And finally delete the file in /tmp.

    This was running fine on my test box but I got an email "CPU usage persists, continued running of this cron job will result in suspension" when I tested it on the live server.

    The last thing I want to do is harm the server performance for others, so what would be the best way to achieve this or is it not possible in a shared environment?

    Ideally the cron would run once a week (time/day unimportant).

    Thanks for any help

  2. #2
    Administrator AndrewT's Avatar
    Join Date
    Mar 2004
    Location
    Tulsa, OK
    Posts
    3,637

    Default

    The downloading via PHP might be part of the problem as it was the PHP script itself which was using a full CPU core for 3+ minutes at a time before being killed. Try simply using wget or some other alternative means to download the file to the server.

    Having said that, 300MB is still quite a chunk of data. Though running it once a week shouldn't be much of an issue.

  3. #3
    Senior Member
    Join Date
    Apr 2004
    Location
    Manchester, UK
    Posts
    168

    Default

    Thanks Andrew & apologies for any problems caused

  4. #4
    Senior Member
    Join Date
    Sep 2005
    Location
    Russia, Nizhny Novgorod
    Posts
    315

    Default

    If you want to be server-friendly, write a shell script that would spawn these processes with minimal priority, so that it does not affect other customers. Other than that, I'm pretty sure that a simple shell script will run MUCH faster than your PHP script + PHP engine + mySQL bindings anyway.

    #!/bin/bash

    nice -n 19 "wget http://blah.foo/bar/baz.quux -o /tmp/baz.csv"
    nice -n 19 "mysql -u baz_import -p baz_restricted < /tmp/baz.csv"
    nice -n 19 "rm -f /tmp/baz.csv"
    ionice might be of help as well, but I'm not sure whether it's installed on the server. Something that Andrew does know for sure.

    Be sure to restrict baz_import to a separate baz database and give it only LOAD privileges for some extra security.

  5. #5
    Senior Member
    Join Date
    Apr 2004
    Location
    Manchester, UK
    Posts
    168

    Default

    Thanks for the advice ZYV - much appreciated.

    I'd kind of given up on this until the site is out of beta (at which point we're planning to move it to a VPS or dedicated box). But I'll give this a try - thanks again.

  6. #6
    Senior Member
    Join Date
    Sep 2005
    Location
    Russia, Nizhny Novgorod
    Posts
    315

    Default

    Unix is not as complicated as it seems to be for the uninitiated In fact after you master it to a certain extent all becomes so blatantly clear that you can't even admit the fact that awhile ago you were thinking that it is difficult to understand and all the systems that you've been using before all of a sudden become ridiculously suboptimal and irritating.

    From my own experience. Good luck!

    Season greetings,
    Z.

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •