I would like to shuffle the pages of a pdf document in a random order.
How can this be done?
Use pdftk to determine the number of pages in the PDF file, call shuf
to generate a randomized list of page numbers, then call pdftk
again to extract the pages in that order.
pdftk original.pdf cat $(shuf -i 1-$(pdftk original.pdf dump_data | awk '$1=="NumberOfPages:" {print $2}')) output randomized.pdf
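The same steps (read the page count from the dump_data output, shuffle the page numbers, build the pdftk call) can be sketched in Python. This is only an illustration: the function names page_count and pdftk_shuffle_command are made up here, and the sample text is a hand-written stand-in for real dump_data output. It builds the argument list without running pdftk.

```python
import random

def page_count(dump_text):
    """Return the page count found in `pdftk ... dump_data` output."""
    for line in dump_text.splitlines():
        fields = line.split()
        # Same test as the awk filter: the first field is "NumberOfPages:".
        if len(fields) >= 2 and fields[0] == "NumberOfPages:":
            return int(fields[1])
    raise ValueError("no NumberOfPages: line found")

def pdftk_shuffle_command(src, dst, n_pages, rng=random):
    """Build a pdftk argument list that outputs the pages in random order."""
    pages = list(range(1, n_pages + 1))  # pdftk page numbers are 1-based
    rng.shuffle(pages)
    return ["pdftk", src, "cat", *map(str, pages), "output", dst]

if __name__ == "__main__":
    # Hand-written stand-in for real dump_data output (assumption).
    sample = "InfoKey: Title\nNumberOfPages: 5\n"
    print(pdftk_shuffle_command("original.pdf", "randomized.pdf", page_count(sample)))
```

The resulting list could be handed to subprocess.run to perform the actual extraction.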
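#!/usr/bin/env python2
# Reads a PDF on stdin and writes it to stdout with the pages in random order.
import random, sys
from pyPdf import PdfFileWriter, PdfFileReader
input = PdfFileReader(sys.stdin)
output = PdfFileWriter()
# In Python 2, range() returns a list, so it can be shuffled in place.
pages = range(input.getNumPages())
random.shuffle(pages)
for i in pages:
    output.addPage(input.getPage(i))
output.write(sys.stdout)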
Usage: /path/to/script <original.pdf >randomized.pdf
We will use pdftk to perform the operations on the PDF document.
Create a temporary working directory:
mkdir tmp
Split the PDF document into one-page documents:
pdftk original.pdf burst output tmp/pg_%02d.pdf
Rename each one-page document to a pseudo-random name (the SHA-1 hash of its path):
for name in tmp/*.pdf; do
    mv "$name" "tmp/$(echo "$name" | sha1sum | cut -f1 -d' ').pdf"
done
Merge all the one-page documents:
pdftk tmp/*.pdf cat output random.pdf
Clean the temporary working directory:
rm -r tmp
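Note that the renaming step scrambles the pages by hash rather than by a random draw, so the same file names always give the same page order. A minimal Python sketch for previewing that order follows. It assumes the shell glob sorts the hexadecimal names as plain strings (this can depend on the locale), and the function names hashed_name and merge_order are illustrative only.

```python
import hashlib

def hashed_name(path):
    """SHA-1 hex digest of the path plus the newline that `echo` appends."""
    return hashlib.sha1((path + "\n").encode()).hexdigest()

def merge_order(paths):
    """Order in which the renamed files would be merged,
    assuming the glob sorts the hex names as plain strings."""
    return sorted(paths, key=hashed_name)

if __name__ == "__main__":
    # File names as produced by the burst step for a five-page document.
    pages = ["tmp/pg_%02d.pdf" % i for i in range(1, 6)]
    for p in merge_order(pages):
        print(p)
```

For a different order on every run, a random name (for example from $RANDOM or mktemp) would be needed in place of the hash.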
A small improvement to Gilles' answer: pass the page range to shuf explicitly with --input-range:
pdftk original.pdf cat $(shuf --input-range=1-$(pdftk original.pdf dump_data | awk '$1=="NumberOfPages:" {print $2}')) output randomized.pdf
There is also a version using pdfjoin/pdfjam: shuffle the sequence of page numbers and use it as input for pdfjoin:
# $1: source pdf file
# $2: last page number to consider
# $3: name of output file
PAGES=()
for k in $(seq 1 "$2" | shuf); do
    # pdfjoin takes alternating "file page" arguments
    PAGES+=("$1")
    PAGES+=("$k")
done
pdfjoin "${PAGES[@]}" --outfile "$3"
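The alternating "file page" argument list that the loop builds can also be sketched in Python, which makes the layout easier to inspect before calling pdfjoin. The function name pdfjoin_args is hypothetical, and the sketch only builds the list, it does not run pdfjoin.

```python
import random

def pdfjoin_args(src, last_page, out, rng=random):
    """Build a pdfjoin argument list with pages 1..last_page in random order."""
    pages = list(range(1, last_page + 1))
    rng.shuffle(pages)
    args = ["pdfjoin"]
    for k in pages:
        # One "file page" pair per page, mirroring the shell loop.
        args += [src, str(k)]
    args += ["--outfile", out]
    return args

if __name__ == "__main__":
    print(pdfjoin_args("original.pdf", 4, "randomized.pdf"))
```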
This is an old question and probably doesn't come up too often, but the answers here are outdated and the package has changed slightly.
Using Python 3, install the new package: pip3 install PyPDF2
This is my quick-and-dirty rewrite of the first answer that works with the new package:
import random, sys
import PyPDF2
path = sys.argv[1]  # source PDF
out = sys.argv[2]   # output PDF
inp = PyPDF2.PdfReader(open(path, 'rb'))
output = PyPDF2.PdfWriter()
# Shuffle the list of zero-based page indices.
page_list = list(range(len(inp.pages)))
random.shuffle(page_list)
for i in page_list:
    output.add_page(inp.pages[i])
output.write(out)
To use this script, pass the source PDF as argument 1 and the output location as argument 2:
python3 shuffle.py normal.pdf randomized.pdf