Download csv and import into database

  • Elite
    Senior Member
    • Apr 2004
    • 168

    Download csv and import into database

    I need to download a large CSV file (300 MB uncompressed) and import it into MySQL via cron.

    Now, I had a PHP script that downloaded the file in chunks and saved it to the /tmp directory using:

    PHP Code:
    $handle = fopen($url, "rb");
    while (!feof($handle)) {
        $contents = fread($handle, 8192);
        $handle2 = fopen($file, 'a');
        fwrite($handle2, $contents);
        fclose($handle2);
    }
    fclose($handle);
    Then I'd import it into MySQL using:

    Code:
    LOAD DATA LOCAL INFILE '/tmp/filename.csv' REPLACE INTO TABLE table_name
    FIELDS TERMINATED BY ','
    OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\n'
    IGNORE 1 LINES
    And finally delete the file in /tmp.

    This was running fine on my test box, but when I tested it on the live server I got an email saying "CPU usage persists, continued running of this cron job will result in suspension".

    The last thing I want to do is harm the server performance for others, so what would be the best way to achieve this or is it not possible in a shared environment?

    Ideally the cron would run once a week (time/day unimportant).
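
    For the weekly schedule, a crontab entry along these lines would do it (the script path and the day/time are placeholder assumptions, since the post says any time works):

    ```
    # m h dom mon dow  command — here: every Sunday at 03:30
    30 3 * * 0 /home/user/bin/weekly_import.sh >/dev/null 2>&1
    ```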

    Thanks for any help
  • AndrewT
    Administrator
    • Mar 2004
    • 3653

    #2
    The downloading via PHP might be part of the problem, as it was the PHP script itself that was using a full CPU core for 3+ minutes at a time before being killed. Try simply using wget or some other means to download the file to the server.
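
    For instance, a small wrapper along these lines (URL, paths, and the rate cap are placeholder assumptions; curl is shown, but wget works the same way):

    ```shell
    #!/bin/sh
    # Sketch: let the HTTP client stream the file straight to disk instead
    # of a PHP read/append loop. --limit-rate caps bandwidth so the download
    # stays polite on a shared box; --fail exits non-zero on HTTP errors.
    fetch_csv() {
        url="$1"
        dest="$2"
        curl --fail --silent --show-error --limit-rate 500k -o "$dest" "$url"
    }
    ```

    The equivalent wget invocation would be `wget --limit-rate=500k -O /tmp/filename.csv "$url"`.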

    Having said that, 300 MB is still quite a chunk of data, though running it once a week shouldn't be much of an issue.


    • Elite
      Senior Member
      • Apr 2004
      • 168

      #3
      Thanks Andrew & apologies for any problems caused


      • ZYV
        Senior Member
        • Sep 2005
        • 315

        #4
        If you want to be server-friendly, write a shell script that spawns these processes with minimal priority, so that they don't affect other customers. Other than that, I'm pretty sure a simple shell script will run MUCH faster than your PHP script + PHP engine + MySQL bindings anyway.

        #!/bin/bash

        # Don't quote a whole command: nice would look for a program with that full name.
        # wget's -O (capital) sets the output file; lowercase -o would write a log there.
        nice -n 19 wget -O /tmp/baz.csv "http://blah.foo/bar/baz.quux"
        # mysql expects SQL, not raw CSV, so pass the LOAD DATA statement with -e
        # (table_name is a placeholder):
        nice -n 19 mysql -u baz_import -p baz_restricted -e "LOAD DATA LOCAL INFILE '/tmp/baz.csv' REPLACE INTO TABLE table_name FIELDS TERMINATED BY ',' IGNORE 1 LINES"
        nice -n 19 rm -f /tmp/baz.csv
        ionice might be of help as well, but I'm not sure whether it's installed on the server; that's something Andrew would know for sure.
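
        As a quick sanity check of what nice actually does (assuming GNU coreutils, where nice with no command prints the niceness it would run at):

        ```shell
        #!/bin/sh
        # Wrapping `nice` (no command) in `nice -n 19` prints the lowest
        # scheduling priority a job would get; on Linux this is clipped at 19.
        nice -n 19 nice
        ```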

        Be sure to restrict baz_import to a separate baz database and give it only the privileges LOAD DATA needs (INSERT, plus DELETE when REPLACE is used) for some extra security.
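
        A minimal sketch of such a restricted account, with hypothetical names and a placeholder password (LOAD DATA itself needs INSERT, and the REPLACE clause additionally needs DELETE):

        ```sql
        -- Hypothetical names; replace 'change_me' with a real password.
        CREATE USER 'baz_import'@'localhost' IDENTIFIED BY 'change_me';
        GRANT INSERT, DELETE ON baz.table_name TO 'baz_import'@'localhost';
        ```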


        • Elite
          Senior Member
          • Apr 2004
          • 168

          #5
          Thanks for the advice ZYV - much appreciated.

          I'd kind of given up on this until the site is out of beta (at which point we're planning to move it to a VPS or dedicated box). But I'll give this a try - thanks again.


          • ZYV
            Senior Member
            • Sep 2005
            • 315

            #6
            Unix is not as complicated as it seems to the uninitiated. In fact, once you master it to a certain extent, everything becomes so blatantly clear that you can't even admit that a while ago you thought it was difficult to understand, and all the systems you used before suddenly seem ridiculously suboptimal and irritating.

            From my own experience. Good luck!

            Season's greetings,
            Z.

