27 November 2009

PHP: How to Compare Directory Attributes & all files within

I wrote this function today to compare the contents of two folders (directories) on three points:

  • Do files exist in both directories?
  • Were they both modified at the same time?
  • Are the contents identical?

This function returns a two-dimensional array (a list of associative arrays, one per difference) covering every difference it finds between the contents of the two directories, or false if it finds none.
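
To give a feel for the result, here is a rough sketch of what the returned array can look like. The keys (path1, path2, status) and the three status values come from the function itself; the file paths are made up purely for illustration.

```php
<?php
// Illustrative shape only: these paths are hypothetical, not real output.
$differences = array(
    // File exists under the first directory but not under the second
    array('path1' => 'site/index.php', 'path2' => 0, 'status' => 'non-existing'),
    // Both files exist but their modified times differ
    array('path1' => 'site/css/main.css', 'path2' => 'backup/css/main.css', 'status' => 'modified'),
    // Same modified time, different contents
    array('path1' => 'site/js/app.js', 'path2' => 'backup/js/app.js', 'status' => 'contents'),
);

// Walk the list and print one line per difference
foreach ($differences as $diff) {
    echo $diff['status'] . ': ' . $diff['path1'] . "\n";
}
```

A path that does not exist is reported as 0 rather than as a string, so check for that before passing it to other file functions.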




//SAMPLE USAGE:
print_r(returnDirDifferences("folder/subfolder","folder2/subfolder2/other"));


function returnDirDifferences($dir1, $dir2='.'){
 if(!$dir1){return(false);}
 $ignore = array('.', '..');
 $returnarray=array();
 $dh = @opendir($dir1);
 if(!$dh){return(false);} // $dir1 could not be opened, so there is nothing to compare
 while(false !== ($file = readdir($dh))){ // Loop through the directory
  if(!in_array($file, $ignore)){
   if(is_dir("$dir1/$file")){
    if(is_dir("$dir2/$file")){// It's a directory in both places, so keep reading down:
     $tempresults = returnDirDifferences("$dir1/$file","$dir2/$file");
     if($tempresults){ // false means the subdirectories had no differences
      $returnarray = array_merge($returnarray,$tempresults);
     }
    }else{
     array_push($returnarray,array('path1'=>"$dir1/$file",'path2'=>0,'status'=>"non-existing"));
    }
   } else {
    $tempresults = returnFileDifferences("$dir1/$file","$dir2/$file");
    if($tempresults){array_push($returnarray,$tempresults);}
   }
  }
 }
 closedir($dh);
 //return the list of differences (one array per difference) or false if there are none:
 if($returnarray){return($returnarray);}else{return(false);}
}
function returnFileDifferences($path1,$path2){
  $path1=(file_exists($path1) ? $path1 : 0);
  $path2=(file_exists($path2) ? $path2 : 0);
  if(!$path1 || !$path2){//check existence:
     return array('path1'=>$path1,'path2'=>$path2,'status'=>'non-existing');
  }else if(filemtime($path1)!=filemtime($path2)){//check file modified time:
     return array('path1'=>$path1,'path2'=>$path2,'status'=>'modified');
  }else if (crc32(file_get_contents($path1))!=crc32(file_get_contents($path2))){ //check contents:
     return array('path1'=>$path1,'path2'=>$path2,'status'=>'contents'); 
  }//no differences:
 
  return false;
}
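
One thing to be aware of: the loop above only walks $dir1, so a file that exists only in $dir2 is never reported. A rough sketch of one way to spot those, using a hypothetical helper I've called namesOnlyIn (it is not part of the function above, and it only looks one level deep):

```php
<?php
// Hypothetical helper: returns the entry names found in $dirA but not in $dirB.
// Only the top level of each directory is inspected.
function namesOnlyIn($dirA, $dirB){
    $ignore = array('.', '..');
    $a = is_dir($dirA) ? array_diff(scandir($dirA), $ignore) : array();
    $b = is_dir($dirB) ? array_diff(scandir($dirB), $ignore) : array();
    // array_diff keeps the original keys, so reindex before returning
    return array_values(array_diff($a, $b));
}

// Demo on two throwaway directories under the system temp folder
$base = sys_get_temp_dir() . '/dircmp_' . uniqid();
mkdir("$base/one", 0777, true);
mkdir("$base/two", 0777, true);
file_put_contents("$base/one/shared.txt", 'x');
file_put_contents("$base/two/shared.txt", 'x');
file_put_contents("$base/two/extra.txt", 'y');

$onlyInOne = namesOnlyIn("$base/one", "$base/two");
$onlyInTwo = namesOnlyIn("$base/two", "$base/one");
print_r($onlyInTwo);
```

Another option would be to call returnDirDifferences a second time with the two arguments swapped and pick out the 'non-existing' entries from that result.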

8 comments:

  1. It turns out PHP alters the modified time on files when moving them, so comparing modified times is not a reliable way to tell whether two files are the same. To fix this, simply remove this branch from the function:

    else if(filemtime($path1)!=filemtime($path2)){//check file modified time:
    return array('path1'=>$path1,'path2'=>$path2,'status'=>'modified');
    }

  2. Recursive results didn't seem to make it up the chain, so I changed this line

    array_merge($returnarray,$tempresults);

    to

    $returnarray = array_merge($returnarray,$tempresults);

    Now, recursive results show up in the final array.

    Excellent function, by the way.

    Best Regards,

    Marc

  3. Thanks Marc - looks like this is a compatibility issue with PHP5. I've updated the function above but anyone using PHP4 will likely need to revert to the old line:
    array_merge($returnarray,$tempresults);
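
For anyone wondering why the assignment matters: in current PHP, array_merge returns the merged array and leaves its arguments untouched. A small sketch with made-up entries:

```php
<?php
// Two small result lists with illustrative entries
$returnarray = array(array('status' => 'modified'));
$tempresults = array(array('status' => 'contents'));

// Without the assignment, the merged result is thrown away
array_merge($returnarray, $tempresults);
$countWithoutAssign = count($returnarray);

// With the assignment, $returnarray picks up the extra entries
$returnarray = array_merge($returnarray, $tempresults);
$countWithAssign = count($returnarray);

echo "$countWithoutAssign then $countWithAssign\n";
```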

  4. This comment has been removed by the author.

  5. You can check filesize first; then crc32/md5/sha. I would not rely on dates.
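
A minimal sketch of that suggestion, assuming both files already exist. The helper name filesMatch is hypothetical, and md5 is just one of the hashes mentioned; any algorithm accepted by hash_file would do.

```php
<?php
// Hypothetical helper: true when two existing files have identical contents.
// Sizes are compared first because that is much cheaper than hashing.
function filesMatch($path1, $path2){
    clearstatcache(); // avoid stale cached sizes
    if(filesize($path1) !== filesize($path2)){
        return false; // different sizes can never match, so skip the hash
    }
    return hash_file('md5', $path1) === hash_file('md5', $path2);
}

// Demo with three temporary files
$a = tempnam(sys_get_temp_dir(), 'cmp');
$b = tempnam(sys_get_temp_dir(), 'cmp');
$c = tempnam(sys_get_temp_dir(), 'cmp');
file_put_contents($a, 'hello');
file_put_contents($b, 'hello');
file_put_contents($c, 'hello world');

$sameContents = filesMatch($a, $b);
$differentSizes = filesMatch($a, $c);
var_dump($sameContents, $differentSizes);

// Clean up the temporary files
unlink($a); unlink($b); unlink($c);
```

Modified times are left out entirely here, in line with the advice not to rely on dates.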
