Thursday, May 15, 2014

Google Drive orphan files

Google Drive is not your traditional filesystem.

It so happened that I had shared a folder with my wife and, from her account, had copied a big chunk of our picture album into my folder before realizing that I was exceeding her quota.

At that moment I cancelled the operation and deleted the folder from my account.

However, Google Drive treats files more like database records than like entries in a directory tree. In fact, it even allows a folder to contain multiple files with the same name, because each file has a different ID.

After I deleted my folder containing her files, all of her files remained in her account (and thus kept taking up quota) but were nowhere to be found, because I had deleted their 'parent' folder.
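
To make the ID-based model concrete, here is a minimal in-memory sketch in plain Java. It is not the Drive API: DriveModel, Entry and all method names are hypothetical, and the way deleteFolder behaves is only a simplification of what happened to me as described above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for Drive's storage model, not the real API.
class DriveModel {
    // An entry is identified by its ID; the title is just an attribute.
    static class Entry {
        final String id;
        final String title;
        final Set<String> parentIds = new HashSet<>();

        Entry(String id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    private final Map<String, Entry> entriesById = new LinkedHashMap<>();
    private int nextId = 1;

    // Creates an entry under the given folder ID (null means no parent).
    Entry create(String title, String parentId) {
        Entry e = new Entry("id" + nextId++, title);
        if (parentId != null) {
            e.parentIds.add(parentId);
        }
        entriesById.put(e.id, e);
        return e;
    }

    // Simplified deletion (an assumption): the folder goes away and its
    // children merely lose their reference to it.
    void deleteFolder(String folderId) {
        entriesById.remove(folderId);
        for (Entry e : entriesById.values()) {
            e.parentIds.remove(folderId);
        }
    }

    // Orphans are entries that no longer have any parent.
    List<Entry> orphans() {
        List<Entry> result = new ArrayList<>();
        for (Entry e : entriesById.values()) {
            if (e.parentIds.isEmpty()) {
                result.add(e);
            }
        }
        return result;
    }
}
```

Creating two entries titled "a.jpg" under the same folder works because they get different IDs, and calling deleteFolder on that folder leaves both of them in orphans().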

I googled but could not find any solution; even Google's own link for this (https://drive.google.com/#query?view=2&orphans=0) didn't work.

I found that I could see the files under the 'All items' option of Google Drive (in detailed view each file has its parent folder name right after it; for orphan files there's nothing).
However, manually selecting over 9000 files and then moving each batch out of "All items" was taking forever.

At some point I found this guy's webpage:

It claimed to fix things but it wasn't working, so I saw that I had to take matters into my own hands.

I brought up Eclipse and started on it.

The first tutorial was straightforward:

After that, I had to locate all the files.
With the demo running, I just added the sample from here: https://developers.google.com/drive/v2/reference/files/list

After that, locating the orphaned files was simple once I saw that Parents is an attribute of files.

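try {
 FileList files = request.execute();
 for (File f : files.getItems()) {
  if (f.getParents() == null || f.getParents().isEmpty()) {
   orphans.add(f);
  }
 }
} catch (IOException e) {
 System.out.println("An error occurred: " + e);
}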
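
Since a listing of 9000+ files comes back in pages, the orphan check has to run on every page. The following is a rough sketch of that loop using hypothetical stand-in types (Item, Page and a fetchPage function) instead of the Drive client classes, so it runs on its own; how the real client exposes page tokens should be checked against the files/list sample linked above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-ins for the Drive client types, so the loop runs on its own.
class OrphanScan {
    static class Item {
        final String title;
        final List<String> parents; // may be null when the field is absent

        Item(String title, List<String> parents) {
            this.title = title;
            this.parents = parents;
        }
    }

    static class Page {
        final List<Item> items;
        final String nextPageToken; // null or empty on the last page

        Page(List<Item> items, String nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    // Treats both a missing and an empty parent list as "orphaned".
    static boolean isOrphan(Item item) {
        return item.parents == null || item.parents.isEmpty();
    }

    // fetchPage maps a page token (null for the first page) to a page of results.
    static List<Item> findOrphans(Function<String, Page> fetchPage) {
        List<Item> orphans = new ArrayList<>();
        String token = null;
        do {
            Page page = fetchPage.apply(token);
            for (Item item : page.items) {
                if (isOrphan(item)) {
                    orphans.add(item);
                }
            }
            token = page.nextPageToken;
        } while (token != null && !token.isEmpty());
        return orphans;
    }
}
```

Keeping the check in a small isOrphan method means the "no parents" rule lives in one place, whether the parent list is missing or just empty.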

To play it safe I had to check for repeated filenames. Filenames in Google Drive are "Titles" and, as said before, they can be repeated (I did not dare test how they would show up in your file explorer's replica of Google Drive if you have two files with the same name in the same folder).

First, there was easy-to-copy source code somewhere in the API guide showing how to rename files (it couldn't be as easy as file.setTitle :) )

 private static File renameFile(Drive service, String fileId, String newTitle) {
  try {
   File file = new File();
   file.setTitle(newTitle);
   file.setOriginalFilename(newTitle);

   // Rename the file.
   Files.Patch patchRequest = service.files().patch(fileId, file);

   File updatedFile = patchRequest.execute();
   return updatedFile;
  } catch (IOException e) {
   System.out.println("An error occurred: " + e);
   return null;
  }
 }

With that out of the way, I could now rename files that shared the same name:

 List<File> list = retrieveAllFilesWithoutParents(service);
 HashMap<String, Vector<File>> filemap = new HashMap<String, Vector<File>>();

 for (File f : list) {
  String name = getName(f);
  if (!filemap.containsKey(name)) {
   filemap.put(name, new Vector<File>());
  }
  filemap.get(name).add(f);
 }

 for (String name : filemap.keySet()) {
  if (filemap.get(name).size() != 1) {
   System.out.println(name + " count is "
      + filemap.get(name).size());
   String[] sp = name.split("\\.(?=[^\\.]+$)");
   int count = 0;
   for (File f : filemap.get(name)) {
    // Files without an extension have no sp[1], so append the counter instead.
    String newname = sp.length > 1
      ? sp[0] + "_" + ++count + "." + sp[1]
      : name + "_" + ++count;
    System.out.println(name + " will be renamed to " + newname);
    renameFile(service, f.getId(), newname);
   }
  }
 }

(The code above ensures that any repeated filename (e.g. a.txt) is renamed accordingly (e.g. a_1.txt, a_2.txt, ...).)
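
If you want to see what would be renamed before touching anything in Drive, the naming rule can be tried out on plain strings first. This is only a sketch: DuplicateTitles, numbered and renamePlan are hypothetical names, and it uses lastIndexOf rather than the regex split above, so titles without an extension simply get the counter appended.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: computes new titles only, without calling Drive.
class DuplicateTitles {
    // Inserts "_<n>" before the last extension, or appends it when there is none.
    static String numbered(String title, int n) {
        int dot = title.lastIndexOf('.');
        if (dot <= 0) {
            return title + "_" + n;
        }
        return title.substring(0, dot) + "_" + n + title.substring(dot);
    }

    // Returns, for each title that occurs more than once, the replacement titles in order.
    static Map<String, List<String>> renamePlan(List<String> titles) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String t : titles) {
            counts.merge(t, 1, Integer::sum);
        }
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > 1) {
                List<String> names = new ArrayList<>();
                for (int i = 1; i <= e.getValue(); i++) {
                    names.add(numbered(e.getKey(), i));
                }
                plan.put(e.getKey(), names);
            }
        }
        return plan;
    }
}
```

One limitation to keep in mind: the sketch does not check whether a generated title such as a_1.txt already exists among the other files, so a second pass may be needed in that case.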

I didn't dare move the files immediately after renaming because I was unsure how Google's eventual consistency would react, so I waited a few seconds and then ran the code to move the files into a new folder.
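
I waited by hand, but the same idea could be automated by polling until some check passes, for example until a fresh listing shows the new titles. The helper below is a generic sketch of that, not something I ran against Drive: WaitFor and waitUntil are hypothetical names, and what the check should actually test is left to the caller.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: retries a check with a fixed pause between attempts.
class WaitFor {
    // Returns true as soon as the check passes, false if every attempt fails
    // or the thread is interrupted while pausing.
    static boolean waitUntil(BooleanSupplier check, int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (check.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and give up
                    return false;
                }
            }
        }
        return false;
    }
}
```

A call such as WaitFor.waitUntil(check, 10, 2000L) would try up to ten times with two seconds in between, and the move step would only run if it returns true.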

Creating the new folder was easy:
File body = new File();
String folder_name = "OrphanFiles_" + System.currentTimeMillis();
body.setTitle(folder_name);
body.setMimeType("application/vnd.google-apps.folder");
File folder = service.files().insert(body).execute();

And moving the files into the folder:
List<File> list = retrieveAllFilesWithoutParents(service);
for (File f : list) {
 // Use the file from this iteration; "name" and "filemap" belong to the rename step.
 System.out.println("Moving " + getName(f) + " into " + folder_name);
 insertFileIntoFolder(service, folder.getId(), f.getId());
}
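
With thousands of files, a single failed call should not stop the whole run. Here is a small sketch of a move loop that keeps going and reports which files failed. OrphanMover and ParentAdder are hypothetical; in my code the adder would presumably just delegate to insertFileIntoFolder.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper around whatever call attaches a file to a folder.
class OrphanMover {
    interface ParentAdder {
        void addParent(String fileId, String folderId) throws IOException;
    }

    // Tries every file and returns the IDs that could not be moved.
    static List<String> moveAll(List<String> fileIds, String folderId, ParentAdder adder) {
        List<String> failed = new ArrayList<>();
        for (String fileId : fileIds) {
            try {
                adder.addParent(fileId, folderId);
            } catch (IOException e) {
                System.out.println("Could not move " + fileId + ": " + e.getMessage());
                failed.add(fileId);
            }
        }
        return failed;
    }
}
```

The returned list can be fed straight back into moveAll for a retry, which is handy when only a handful of calls failed.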

That was all, I hope you find it useful.

2 comments:

  1. Did you execute this as one function in eclipse or in terminal?
