Split NMR-style multiple model pdb files into individual models
Latest revision as of 21:14, 3 June 2016
All of the approaches below assume a correctly formatted PDB file that contains both MODEL and ENDMDL records.
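Before splitting, it can help to confirm that the MODEL and ENDMDL records really do come in matched pairs. The following is a minimal sketch of such a check, not part of the original scripts; the function name `check_model_records` is hypothetical, and it assumes record names sit in columns 1-6 as in the standard PDB layout.

```python
def check_model_records(lines):
    """Return the number of MODEL/ENDMDL pairs; raise ValueError if they are unbalanced.

    `lines` is any iterable of text lines, for example an open file.
    The function name is hypothetical and used only for illustration.
    """
    open_model = False
    count = 0
    for number, line in enumerate(lines, start=1):
        record = line[:6].strip()  # record name, assumed to occupy columns 1-6
        if record == "MODEL":
            if open_model:
                raise ValueError(f"line {number}: MODEL before the previous ENDMDL")
            open_model = True
        elif record == "ENDMDL":
            if not open_model:
                raise ValueError(f"line {number}: ENDMDL without a preceding MODEL")
            open_model = False
            count += 1
    if open_model:
        raise ValueError("the last MODEL record has no ENDMDL")
    return count
```

Calling it on an open file handle, for instance `check_model_records(open("models.pdb"))`, should return the number of models if the file is well formed.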
Bash/awk one-liner
This one-liner splits the file models.pdb into individual pdb files named model_###.pdb, where ### is the zero-padded model index (model_001.pdb, model_002.pdb, and so on).
grep -n '^MODEL\|^ENDMDL' models.pdb | cut -d: -f 1 | \
awk '{if(NR%2) printf "sed -n %d,",$1+1; else printf "%dp models.pdb > model_%03d.pdb\n", $1-1,NR/2;}' | bash -sf
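The one-liner is terse, so here is a rough Python sketch of the same idea: collect the line numbers of the MODEL and ENDMDL records, pair them up, and build one `sed` command per pair. The function names `model_line_ranges` and `sed_commands` are hypothetical, and the sketch assumes the records start at the beginning of a line and strictly alternate.

```python
def model_line_ranges(lines):
    """Yield (model_index, first_line, last_line) for the lines inside each model.

    Line numbers are 1-based, like the ones reported by grep -n.
    Assumes MODEL and ENDMDL records strictly alternate.
    """
    marker_lines = [
        number
        for number, line in enumerate(lines, start=1)
        if line.startswith(("MODEL", "ENDMDL"))
    ]
    # Markers at odd positions are MODEL lines, markers at even positions are ENDMDL lines.
    pairs = zip(marker_lines[0::2], marker_lines[1::2])
    for index, (start, end) in enumerate(pairs, start=1):
        yield index, start + 1, end - 1


def sed_commands(lines, source="models.pdb"):
    """Build sed commands of the kind the one-liner pipes into bash."""
    return [
        f"sed -n {first},{last}p {source} > model_{index:03d}.pdb"
        for index, first, last in model_line_ranges(lines)
    ]
```

Printing the result of `sed_commands(open("models.pdb"))` shows the commands without running them, which is a convenient way to inspect what the pipeline would do.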
Bash script
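i=1
# IFS= and -r keep each line verbatim; PDB is a fixed-column format,
# so whitespace must not be collapsed
while IFS= read -r line; do
    printf '%s\n' "$line" >> "model_${i}.pdb"
    [[ $line == ENDMDL* ]] && ((i++))
done < /path/to/file.pdb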
Awk script
Save the script below as script.awk and call it as:
awk -f script.awk < models.pdb
BEGIN {file = 0; filename = "model_" file ".pdb"}
/ENDMDL/ {getline; file ++; filename = "model_" file ".pdb"}
{print $0 > filename}
Perl script
$base='1g9e';
open(IN,"<$base.pdb");
@indata = <IN>;
$i=0;
foreach $line(@indata) {
    if($line =~ /^MODEL/) {++$i;$file="${base}_$i.pdb";open(OUT,">$file");next}
    if($line =~ /^ENDMDL/) {next}
    if($line =~ /^ATOM/ || $line =~ /^HETATM/) {print OUT "$line"}
}
Python script
For this kludgy version using Python 2.x, you need to paste the entire PDB file into the script where it says "PASTE YOUR PDB FILE TEXT HERE".
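You can fork the code on GitHub: https://github.com/fomightez/structurework/blob/master/python_scripts/super_basic_multiple_model_PDB_file_splitter.py
(A more full-featured version, which you can point at a file or a folder of files using a command-line argument, is also on GitHub: https://github.com/fomightez/structurework/blob/master/python_scripts/multiple_model_PDB_file_splitter.py)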
PDB_text = """
PASTE YOUR PDB FILE TEXT HERE
"""
model_number = 1
new_file_text = ""
for line in filter(None, PDB_text.splitlines()):
    line = line.strip() #for better control of ends of lines
    if line == "ENDMDL":
        # save file with file number in name
        output_file = open("model_" + str(model_number) + ".pdb", "w")
        output_file.write(new_file_text.rstrip('\r\n')) #rstrip to remove trailing newline
        output_file.close()
        # reset everything for next model
        model_number += 1
        new_file_text = ""
    elif not line.startswith("MODEL"):
        new_file_text += line + '\n'
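If pasting the PDB text into the script is inconvenient, the same loop can read from a file instead. The following is a minimal sketch of that variant and is not the full-featured script on GitHub; the function name `split_models` and the `prefix` parameter are hypothetical. Unlike the script above, it keeps only the lines between each MODEL and ENDMDL record, so header lines before the first MODEL are not copied into the first output file.

```python
def split_models(pdb_path, prefix="model_"):
    """Write each MODEL...ENDMDL block of pdb_path to its own file.

    Returns the list of file names written. The function name and the
    `prefix` parameter are hypothetical, chosen only for this sketch.
    """
    written = []
    current = None  # lines of the model being collected; None when outside a model
    with open(pdb_path) as handle:
        for line in handle:
            record = line[:6].strip()  # record name, assumed to occupy columns 1-6
            if record == "MODEL":
                current = []
            elif record == "ENDMDL" and current is not None:
                name = f"{prefix}{len(written) + 1}.pdb"
                with open(name, "w") as out:
                    out.writelines(current)
                written.append(name)
                current = None
            elif current is not None:
                current.append(line)
    return written
```

Calling `split_models("models.pdb")` would write model_1.pdb, model_2.pdb, and so on into the current directory, matching the naming used by the script above.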