I need to remove duplicate images from my WordPress media library

I’m trying to remove duplicate images from my WordPress media library.

The posts themselves aren’t duplicated, but each attachment image appears twice for each post.

I’ve had a look around, but nobody seems to have a clear-cut answer. Some people have suggested using something like this:

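  <?php
    // Fetch every post, then list the attachments that belong to it.
    $p = get_posts(array('numberposts' => -1));
    foreach ($p as $t) {
      // post_parent limits the result to the current post's attachments.
      $s = get_children(array('post_parent' => $t->ID, 'post_type' => 'attachment', 'numberposts' => -1));
      foreach ($s as $u) {
        var_dump($u);
      }
    }
  ?>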

But there still seems to be a piece missing: this gives me a list of attachments, but I still don’t know how to compare them.

It seems to me that I need to use some SQL queries and delete the media files straight from the database. I’m not quite sure how to go about this, though.

In theory, I need something like: find the posts; for each post, get its attachments; if two attachments have the same filename, delete one of them.
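
The loop described above could be sketched in plain PHP roughly as follows. This is only a minimal sketch under stated assumptions: `find_duplicate_ids()` is a hypothetical helper (not a WordPress function), the sample array stands in for a map of attachment ID to the URL returned by `wp_get_attachment_url()`, and it assumes duplicates share exactly the same file name, which may not hold if the uploads were renamed.

```php
<?php
/**
 * Given a map of attachment ID => file URL or path, return the IDs whose
 * file name was already seen under a lower ID.
 * Hypothetical helper for illustration; not part of WordPress.
 */
function find_duplicate_ids(array $files): array
{
    ksort($files); // lowest ID first, so the oldest copy is the one kept
    $seen = [];
    $duplicates = [];
    foreach ($files as $id => $file) {
        $name = strtolower(basename($file)); // compare file names only
        if (isset($seen[$name])) {
            $duplicates[] = $id;
        } else {
            $seen[$name] = $id;
        }
    }
    return $duplicates;
}

// Sample data (made up) standing in for ID => wp_get_attachment_url($id).
$files = [
    12 => 'http://example.com/wp-content/uploads/2014/01/photo.jpg',
    15 => 'http://example.com/wp-content/uploads/2014/02/photo.jpg',
    18 => 'http://example.com/wp-content/uploads/2014/02/logo.png',
];
$duplicates = find_duplicate_ids($files);
print_r($duplicates);
```

Inside WordPress, each returned ID could then be passed to `wp_delete_attachment($id, true)`, which removes the attachment post together with its files, so it is worth reviewing the list before deleting anything.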

Help appreciated.

1 comment

  1. Create a file in your theme folder and paste in either the first or the second solution below:

    1st way (Compare against their titles – recommended):

    <?php
    
    require('../../../wp-blog-header.php');
    global $wpdb;
    
    $querys = $wpdb->get_results(
        "
        SELECT a.ID, a.post_title, a.post_type
        FROM $wpdb->posts AS a
           INNER JOIN (
              SELECT post_title, MIN( id ) AS min_id
              FROM $wpdb->posts
              WHERE post_type = 'attachment'
              GROUP BY post_title
              HAVING COUNT( * ) > 1
           ) AS b ON b.post_title = a.post_title
        AND b.min_id <> a.id
        AND a.post_type = 'attachment'
        ");
    
    echo "<style>td {padding:0 0 10px;}</style>";
    echo "<h2>DUPLICATES</h2>\n";
    echo "<table>\n";
    echo "<tr><th></th><th>Title</th><th>URL</th></tr>\n";
    
    // One row per duplicate: a delete link, the title, and the file URL.
    foreach ( $querys as $query )
    {
        $attachment_url = wp_get_attachment_url($query->ID);
        $delete_url = get_delete_post_link($query->ID);
        echo "<tr>
            <td><a style=\"color: #FFF;background-color: #E74C3C;text-decoration: none;padding: 5px;\" target=\"_blank\" href=\"".$delete_url."\">DELETE</a></td>
            <td>".get_the_title($query->ID)."</td>
            <td><a href=\"".$attachment_url."\">".$attachment_url."</a></td>
            </tr>\n";
    }
    
    echo "</table>";
    
    ?>
    

    2nd way (Compare against their URLs):
    This checks each file against every other file, ignoring the extension and the last character before it. So if you have file.jpg and file1.jpg, it will catch them as duplicates. The drawback is that if you have file1.jpg and file11.jpg, it will also catch them as duplicates, even though they might be completely different files.

    <?php
    
    require('../../../wp-blog-header.php');
    
    $args = array(
        'post_type' => 'attachment',
        'numberposts' => -1,
        'orderby' => 'name',
        'order' => 'ASC',
        'post_status' => null,
        'post_parent' => null, // any parent
    );
    
    $newArray = array();
    
    $attachments = get_posts($args);
    $attachments2 = $attachments;
    
    
    if ($attachments){
    
        foreach($attachments as $post){
            $attachment_url = wp_get_attachment_url($post->ID);
            $delete_url = get_delete_post_link($post->ID);
            $newArray[] = array("att_url" => $attachment_url, "del_url" => $delete_url);
        }
    
        echo "<table>";
            $newArray2 = $newArray;
            echo "<tr><td><h2>DUPLICATES</h2></td></tr>";
            foreach($newArray as $url1){
                $url_del = $url1['del_url'];
                $url1 = $url1['att_url'];
                $url11 = substr($url1,0,strrpos($url1,"."));
                foreach($newArray2 as $url2){
                    $url2 = $url2['att_url'];
                    $url2 = substr($url2,0,strrpos($url2,".") - 1);
    
                    if($url2 == $url11)
                    {
                        echo "<tr><td><a href=\"".$url_del."\">DELETE</a></td><td><a href=\"".$url1."\">".$url1."</a></td></tr>";
                    }
                }
            }
        echo "</table>";
        echo "<table>";
            echo "<tr><td><h2>ALL ATTACHMENTS</h2></td></tr>";
    
            foreach($attachments as $tst1){
                echo "<tr><td>".$tst1->guid."</td></tr>";
            }
        echo "</table>";
    }
    ?>
    

    Then open that file in your browser while logged in as a user who is allowed to delete media, and it will show you a list of all the duplicates.
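
To make the second approach's matching rule easier to see, here is a small standalone sketch of the same comparison, with no WordPress calls. `file_stem()` and `looks_like_copy()` are hypothetical helper names introduced only for this illustration; the logic mirrors the two `substr()` calls in the script above.

```php
<?php
/**
 * Return the part of a file name or URL before its last dot.
 * Hypothetical helper for illustration.
 */
function file_stem(string $url): string
{
    $dot = strrpos($url, '.');
    return $dot === false ? $url : substr($url, 0, $dot);
}

/**
 * True when $candidate's stem, minus its last character, equals
 * $original's stem. This is the test the second script applies.
 */
function looks_like_copy(string $original, string $candidate): bool
{
    $stem = file_stem($candidate);
    return $stem !== '' && substr($stem, 0, -1) === file_stem($original);
}

var_dump(looks_like_copy('file.jpg', 'file1.jpg'));   // the intended match
var_dump(looks_like_copy('file1.jpg', 'file11.jpg')); // also matches: a false positive
var_dump(looks_like_copy('file.jpg', 'file.jpg'));    // a file does not match itself
```

Because any single trailing character counts, the rule cannot tell a numbered copy from an unrelated file whose name happens to be one character longer, which is why the title-based query is the recommended option.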