Recombining 64kB blocks of a broken duplicity / deja dup backup after manual decryption

Asked by tomx

I switched from backintime to duplicity / deja dup version 0.6.15 (it might have been 0.6.14 when I made the backup) only a few weeks ago.

After an update the system broke and was no longer bootable.
No problem, I assumed, as the deja dup backup was there.

I ran into the SHA1 mismatch issue with 7 of the 699 50 MB PGP packages.

I tried out the instructions from https://live.gnome.org/DejaDup/Help/Restore/WorstCase (Thanks for those MichaelTerry!) and was relieved when I saw the folders reappearing.

Now I have two folders, multivol_snapshot and snapshot. The first contains one folder per file, and each of those folders contains the 64 kB parts of the actual file.

As I wanted to restore as much, and as well, as possible, I looked for scripts that do the 'cat' automatically.

I tested the one from here
https://gist.github.com/1365425
which uses the line
            do("cat \"" + path + "\"/* > \"" + m2s(path) + "\"")

on each folder to recombine the puzzle.
It creates the files, but something goes wrong.

Let's say an image was spread over 4 blocks. If I view that image, half of it looks OK but the rest is missing, or the image looks picassofied. Funny at times, but not a solution for getting back your family pictures of the past 10 years.

This person describes a similar behavior.
https://bugs.launchpad.net/duplicity/+bug/713832/comments/1

I tried out that shell command in one folder:
  cat `ls [0-9]* | sort -n` > img.jpg
And voilà: that creates a valid JPEG out of the many 64 kB blocks.

I have no idea of the Python syntax, hence the question is:
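A note on why the order matters: the shell expands `*` in lexicographic (character by character) order, so a block named 10 is concatenated before block 2, whereas `sort -n` orders the names by numeric value. A minimal Python sketch of the difference, using made-up block names:

```python
# Hypothetical block names, as found inside one multivol_snapshot folder
blocks = ["1", "2", "3", "10", "11"]

# Lexicographic order, which is what the shell glob `*` produces
alphabetical = sorted(blocks)
print(alphabetical)  # ['1', '10', '11', '2', '3']

# Numeric order, equivalent to `sort -n`
numeric = sorted(blocks, key=int)
print(numeric)       # ['1', '2', '3', '10', '11']
```

With fewer than ten blocks both orders agree, which would explain why small files come out fine while larger ones are scrambled.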

How do I combine that line
    cat `ls [0-9]* | sort -n` > img.jpg

into the script from
   https://gist.github.com/1365425

so that it preserves the correct filename and can be run automatically, recursing / walking down the entire backup folder structure?

This would be a great improvement for Michael's emergency description at https://live.gnome.org/DejaDup/Help/Restore/WorstCase

Question information

Language: English
Status: Solved
For: Ubuntu duplicity
Assignee: No assignee
Solved by: tomx
mycae (mycae) said :
#1

Create a quick bash script like the one below. Make sure that the folder names do not contain spaces, since bash splits the output of find on whitespace.

Paste the script into a new file, for example "recover.sh", and place it in the parent folder. Right-click the file, go to Permissions and tick Execute for the user. Then open a terminal in that folder and run it with ./recover.sh

Hopefully that should be right:

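#!/bin/bash
# Visit every folder below the current one.
# The word splitting here is why folder names must not contain spaces.
for i in `find ./ -type d`
do
    pushd "$i" > /dev/null || continue
    # Join the numbered blocks in numeric order; skip folders without blocks.
    # The result is always written as img.jpg inside the folder itself.
    ls [0-9]* > /dev/null 2>&1 && cat `ls [0-9]* | sort -n` > img.jpg
    popd > /dev/null
done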

tomx (tom-xitio) said :
#2

Hi mycae,
thanks for your feedback. I tried your bash script, but apparently it does not work.
I am trying to find out what it does. I am currently stuck on the different interpretation of whitespace, '"' and "'" in Python / bash / ...
Does the pushd/popd command rename the file to the original one (i.e. the folder name)?

The Python script from
https://gist.github.com/1365425
seems to be quite close to what I need and apparently handles spaces.

Its second step, after untarring, does the reassembling recursively and copies the resulting (Frankenstein) file into the /snapshot folder:

    # Step Two
    # join multipart files
    for (path, dirs, files) in os.walk(TO_MULTI):
        print path
        if len(dirs) == 0:
            # we need to combine all these elements
            do("cat \"" + path + "\"/* > \"" + m2s(path) + "\"") #XXX
            #do("rm *")
        else:
            do("mkdir -p \"" + m2s(path) + "\"")

Actually, the only thing missing is to make the line marked XXX take into account the sort option that you proposed.
Optimism is back :)
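For readers who want to stay in Python, the join step could be sketched roughly as below. This is not the gist's code: it is written for Python 3 (the gist uses Python 2 syntax), the function names join_blocks and join_all are made up for illustration, and it assumes that every block file has a purely numeric name. It avoids the shell entirely, so spaces and quotes in paths need no escaping.

```python
import os
import shutil

def join_blocks(block_dir, target_file):
    """Concatenate the numbered blocks in block_dir into target_file."""
    # Keep only purely numeric names and sort them by value, like `sort -n`
    names = sorted((n for n in os.listdir(block_dir) if n.isdecimal()), key=int)
    os.makedirs(os.path.dirname(target_file) or ".", exist_ok=True)
    with open(target_file, "wb") as out:
        for name in names:
            with open(os.path.join(block_dir, name), "rb") as block:
                shutil.copyfileobj(block, out)

def join_all(multi_root, snapshot_root):
    """Rebuild one file under snapshot_root per leaf folder of multi_root."""
    for path, dirs, files in os.walk(multi_root):
        # A leaf folder (no subfolders) holds the blocks of exactly one file
        if not dirs and files and path != multi_root:
            relative = os.path.relpath(path, multi_root)
            join_blocks(path, os.path.join(snapshot_root, relative))
```

A call such as join_all("multivol_snapshot", "snapshot"), run from the root of the untarred backup, would then write each rebuilt file under its original name, because the leaf folder name is reused as the file name.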

tomx (tom-xitio) said :
#3

OK! I wrote a Java program that merges the blocks and writes the files to the snapshot directory, from which the backup can be restored to your working operating system.
I am pasting it here for those who might find it useful.
It has one dependency, the Apache commons-io library; remember to add that to the classpath when compiling and running it.

import java.io.File;
import java.io.FileFilter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Comparator;
import java.util.ListIterator;
import java.util.Vector;

import org.apache.commons.io.filefilter.DirectoryFileFilter;

public class DeFrankensteiner {

 static String untaredRoot = "/media/big5wf/untared2test";
 static Vector<File> resultDirs = new Vector<File>();

 public static void main(String[] args) {
  if (args.length > 0) {
   untaredRoot = args[0];
   if (!new File(untaredRoot).exists()) {
    // Nothing to do without the backup root, so stop here
    System.err.println("Directory does not exist: " + untaredRoot);
    System.exit(1);
   }
  } else {
   System.out.println("Program takes one argument: the root folder of the backup");
   System.out
     .println("Please rerun and specify the root of your untared duplicity backup");
   System.out
     .println("The directory contains two folders, 'snapshot' and 'multivol_snapshot'");
   System.exit(0);
  }
  getLeafDirectories(new File(untaredRoot
    + System.getProperty("file.separator") + "multivol_snapshot"));

  ListIterator<File> iter = resultDirs.listIterator();
  while (iter.hasNext()) {
   File sourceDir = iter.next();
   File[] the64KbBlocks = sourceDir.listFiles();
   // Sort numerically, not alphabetically (1, 2, ..., 10 rather than 1, 10, 2)
   Arrays.sort(the64KbBlocks, new IntValueComparator());
   String targetFileName = sourceDir.getAbsolutePath().replace(
     "multivol_snapshot", "snapshot");
   System.out.println("Will save file to " + targetFileName
     + " after merging " + the64KbBlocks.length
     + " blocks.");
   // instead of /bin/bash try to use java onboard methods
   try {
    File target = new File(targetFileName);
    // Start from a clean file so that a rerun does not append twice
    if (target.exists()) target.delete();
    // Make sure the parent folder exists below 'snapshot'
    File parent = target.getParentFile();
    if (parent != null) parent.mkdirs();

    FileOutputStream fos = new FileOutputStream(target, true);
    try {
     // Append the blocks in numeric order
     for (File file : the64KbBlocks) {
      fos.write(getBytesFromFile(file));
     }
    } finally {
     fos.close();
    }
    System.err.println("Written file file://" + targetFileName);
   } catch (IOException e) {
    e.printStackTrace();
   }
  }

 }

 public static byte[] getBytesFromFile(File file) throws IOException {
  InputStream is = new FileInputStream(file);
  try {
   // Create the byte array to hold the data
   byte[] bytes = new byte[(int) file.length()];

   // Read in the bytes
   int offset = 0;
   int numRead = 0;
   while (offset < bytes.length
     && (numRead = is.read(bytes, offset, bytes.length - offset)) >= 0) {
    offset += numRead;
   }

   // Ensure all the bytes have been read in
   if (offset < bytes.length) {
    throw new IOException("Could not completely read file "
      + file.getName());
   }
   return bytes;
  } finally {
   // Close the input stream even if reading fails
   is.close();
  }
 }

 public static class IntValueComparator implements Comparator<File> {

  public int compare(final File file1, final File file2) {

   int res = 0;
   try {
    res = (Integer.valueOf(file1.getName()) - Integer.valueOf(file2
      .getName()));
   } catch (NumberFormatException e) {
    System.err.println("Something is here that shouldnt be here: "
      + e.getMessage());
    System.err.println(file1.getAbsolutePath());
    System.err.println(file2.getAbsolutePath());
    // e.printStackTrace();
    System.exit(-1);
   }
   return res;
  }

 }

 public static void getLeafDirectories(File dir) {
  File listFile[] = dir.listFiles();
  if (listFile != null) {
   for (int i = 0; i < listFile.length; i++) {
    if (listFile[i].isDirectory()) {

     File[] subdirs = listFile[i]
       .listFiles((FileFilter) DirectoryFileFilter.DIRECTORY);
     if (subdirs.length == 0)
      resultDirs.add(listFile[i]);
     getLeafDirectories(listFile[i]);
    }
   }
  }

 }

}