I just exported a largish WP blog from MediaTemple to PHPFog.
I used the standard WordPress export and import plugins.
For some unknown reason all of my media assets have been duplicated. I now have twice as many images per post.
If an original file was called “Lot-44-Warrens.jpg”, it now has a duplicate called “Lot-44-Warrens1.jpg”. Both files are attached to the same post.
I now have many duplicate images across 250+ posts.
So my question is: how do I remove these duplicates from the media library and from the posts?
I tried to search the media library with “*1.jpg”, but it didn’t work.
Looking for a neat solution that doesn’t mean removing each dupe manually.
Perhaps there is a MySQL query I can run to remove the dupes from the library and the posts?
The site in question is: http://igrealty.phpfogapp.com/ .
Combining the two answers on this page, I found this worked.
Use a run-once script to clean it up. Just an outline, no code:
1. Get all posts with get_posts( array ( 'numberposts' => -1 ) ).
2. For each post, get its attachments with get_children( array ( 'post_type' => 'attachment', 'numberposts' => -1 ) ).
3. For each attachment whose file name is another attachment's name plus a trailing 1, use wp_get_attachment_url() to check whether its URL is actually used in $post->post_content.
4. If it is not used, call wp_delete_attachment() to delete the physical file. This will remove all meta data and all associations in other posts too. It is the best way to remove attached files (imho).
This may take a while. Test it on a local copy of your site. Maybe you should run the process in steps of 50 posts ( 'numberposts' => 50 ).
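A minimal sketch of that outline, assuming the duplicates are always named like the original plus a trailing 1 before the extension (e.g. Lot-44-Warrens1.jpg next to Lot-44-Warrens.jpg) and that the script is saved next to wp-load.php in the WordPress root:

<?php
// Run-once cleanup sketch: delete attachments whose file name is another
// attachment's name plus a trailing "1" and that are not used in the post content.
// Test this on a local copy of the site first.
require_once 'wp-load.php';

$posts = get_posts( array( 'numberposts' => -1 ) ); // or 50 at a time with 'offset'

foreach ( $posts as $post ) {
	$attachments = get_children( array(
		'post_parent' => $post->ID,
		'post_type'   => 'attachment',
		'numberposts' => -1,
	) );

	// Index the attachments by file name without extension.
	$names = array();
	foreach ( $attachments as $attachment ) {
		$file = get_attached_file( $attachment->ID );
		if ( ! $file ) {
			continue;
		}
		$names[ $attachment->ID ] = pathinfo( $file, PATHINFO_FILENAME );
	}

	foreach ( $names as $id => $name ) {
		if ( '1' !== substr( $name, -1 ) ) {
			continue; // not a candidate duplicate
		}
		// The matching original would be the same name without the trailing "1".
		if ( ! in_array( substr( $name, 0, -1 ), $names, true ) ) {
			continue;
		}
		// Only delete if the duplicate's URL is not referenced in the post content.
		$url = wp_get_attachment_url( $id );
		if ( $url && false === strpos( $post->post_content, $url ) ) {
			wp_delete_attachment( $id, true ); // true bypasses trash; meta and file go too
		}
	}
}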
This script will grab all of the attachments in the database, compare the files to one another through MD5, and if it finds a duplicate whose file name ends with a 1, it will remove the image:
Just place it in a file in your root directory.
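A minimal sketch of such a script, assuming it is saved next to wp-load.php in the WordPress root; back up the database and the uploads directory before running it:

<?php
// Sketch: hash every attachment file and delete duplicates whose name ends in "1".
// Back up the database and the uploads directory before running this.
require_once 'wp-load.php';

$attachments = get_posts( array(
	'post_type'   => 'attachment',
	'post_status' => 'inherit',
	'numberposts' => -1,
) );

$seen = array(); // md5 of file contents => attachment ID that is kept

foreach ( $attachments as $attachment ) {
	$path = get_attached_file( $attachment->ID );
	if ( ! $path || ! file_exists( $path ) ) {
		continue;
	}

	$hash = md5_file( $path );
	$name = pathinfo( $path, PATHINFO_FILENAME );

	if ( isset( $seen[ $hash ] ) && '1' === substr( $name, -1 ) ) {
		// Same contents as an attachment already kept and the name ends in "1":
		// treat it as an import duplicate and remove it completely.
		wp_delete_attachment( $attachment->ID, true );
		echo 'Deleted duplicate: ' . esc_html( basename( $path ) ) . "<br>\n";
	} else {
		$seen[ $hash ] = $attachment->ID;
	}
}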
I learned a valuable lesson yesterday: if an application does not provide adequate functions for finding and removing assets from its database, you’re trying to find duplicates across multiple, often unique, fields, and you’re unsure how to write complex MySQL queries, then the best bet is to go back to basics.
In the end, I exported the tables containing the dupes into Excel, filtered them by creating my own “hash” of the significant fields (doing this in MySQL was complex and crashed the server a few times), and pruned the data set until I had a list of IDs I was absolutely sure I wanted to remove. I then built a much simpler MySQL query to delete each row by ID.
This method worked great because I was able to take things slowly and consider each Excel filter I applied. This way I was much more confident I was deleting the correct records. I also have an accurate record in Excel of exactly what I did delete.
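For that last step, a minimal sketch of acting on the pruned ID list from WordPress itself rather than with a raw DELETE query, since wp_delete_attachment() also cleans up the meta data and the physical files; the $ids array here is a placeholder for the IDs collected in the spreadsheet:

<?php
// Run once. $ids is a placeholder for the attachment IDs collected in the spreadsheet.
require_once 'wp-load.php';

$ids = array(); // e.g. array( 101, 102, 103 )

foreach ( $ids as $id ) {
	if ( 'attachment' === get_post_type( $id ) ) {
		wp_delete_attachment( $id, true ); // removes the row, its meta and the physical file
	}
}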