Split NMR-style multiple model pdb files into individual models
Revision as of 16:10, 22 October 2014
This assumes that you have a correctly formatted pdb file that contains both MODEL and ENDMDL records.
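Before splitting, it can help to confirm that assumption. The following is a minimal Python sketch, not part of any script on this page: the function name `check_model_records` is hypothetical, and it only checks that MODEL and ENDMDL records alternate properly, nothing else about the file's validity.

```python
def check_model_records(pdb_text):
    """Return the number of models if MODEL/ENDMDL records are properly paired.

    Raises ValueError otherwise. Hypothetical helper for illustration only.
    """
    open_model = False  # True while we are between MODEL and ENDMDL
    n_models = 0
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # PDB record names occupy columns 1-6
        if record == "MODEL":
            if open_model:
                raise ValueError("MODEL record found before previous ENDMDL")
            open_model = True
        elif record == "ENDMDL":
            if not open_model:
                raise ValueError("ENDMDL record found without a MODEL")
            open_model = False
            n_models += 1
    if open_model:
        raise ValueError("last MODEL record has no ENDMDL")
    return n_models
```

Calling `check_model_records(open("models.pdb").read())` on a well-formed file returns the number of models, which is also the number of output files to expect from the scripts on this page.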
Bash/awk one-liner
This one-liner splits the file models.pdb into individual pdb files named model_###.pdb.
grep -n '^MODEL\|^ENDMDL' models.pdb | cut -d: -f 1 | \
awk '{if(NR%2) printf "sed -n %d,",$1+1; else printf "%dp models.pdb > model_%03d.pdb\n", $1-1,NR/2;}' | bash -sf
Bash script
i=1
# IFS= and -r keep each line verbatim; PDB records are fixed-column
while IFS= read -r line; do
    printf '%s\n' "$line" >> "model_${i}.pdb"
    [[ $line == ENDMDL* ]] && ((i++))
done < /path/to/file.pdb
Awk script
Should be called as
awk -f script.awk < models.pdb
BEGIN {file = 0; filename = "model_" file ".pdb"}
/ENDMDL/ {getline; file ++; filename = "model_" file ".pdb"}
{print $0 > filename}
Perl script
$base='1g9e';open(IN,"<$base.pdb");@indata = <IN>;$i=0;
foreach $line(@indata) {
    if($line =~ /^MODEL/) {++$i;$file="${base}_$i.pdb";open(OUT,">$file");next}
    if($line =~ /^ENDMDL/) {next}
    if($line =~ /^ATOM/ || $line =~ /^HETATM/) {print OUT "$line"}
}
Python script
This kludgy version, written for Python 2.x, requires you to paste the entire PDB file into the script where it says "PASTE YOUR PDB FILE TEXT HERE".
You can fork the code on GitHub: https://github.com/fomightez/structurework/blob/master/python_scripts/super_basic_multiple_model_PDB_file_splitter.py
(Eventually, I hope to have a more full-featured version there that takes the file name as a command-line argument, and after that a web-hosted service that does the split right on a web page.)
PDB_text = """
PASTE YOUR PDB FILE TEXT HERE
"""
model_number = 1
new_file_text = ""
for line in filter(None, PDB_text.splitlines()):
    line = line.strip()  # for better control of ends of lines
    if line == "ENDMDL":
        # save file with file number in name
        output_file = open("model_" + str(model_number) + ".pdb", "w")
        output_file.write(new_file_text.rstrip('\r\n'))  # rstrip to remove trailing newline
        output_file.close()
        # reset everything for next model
        model_number += 1
        new_file_text = ''
    elif not line.startswith("MODEL"):
        new_file_text += line + '\n'
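A variant that reads the PDB file from disk instead of pasted text could look roughly like the sketch below. This is not the code from the GitHub repository: the function name `split_models`, the `prefix` parameter, and the argument handling are assumptions made for illustration. It follows the same logic as the script above: MODEL and ENDMDL lines are dropped, everything else seen before an ENDMDL is written to `model_N.pdb`, and anything after the last ENDMDL is ignored.

```python
import sys


def split_models(pdb_path, prefix="model_"):
    """Write each block ending in ENDMDL to its own file.

    Returns the list of file names written. Hypothetical helper; the
    output naming mirrors the pasted-text script.
    """
    written = []
    model_number = 1
    current_lines = []
    with open(pdb_path) as pdb_file:
        for raw_line in pdb_file:
            line = raw_line.rstrip("\r\n")  # drop line endings, keep columns
            record = line[:6].strip()       # record name, columns 1-6
            if record == "ENDMDL":
                out_name = "%s%d.pdb" % (prefix, model_number)
                with open(out_name, "w") as out_file:
                    out_file.write("\n".join(current_lines) + "\n")
                written.append(out_name)
                # reset for the next model
                model_number += 1
                current_lines = []
            elif record != "MODEL" and line.strip():
                current_lines.append(line)  # skips MODEL and blank lines
    return written


# Runs only when invoked as a script with exactly one argument (the PDB path).
if __name__ == "__main__" and len(sys.argv) == 2:
    for name in split_models(sys.argv[1]):
        print(name)
```

Saved under a name of your choosing (say, the hypothetical `split_models.py`), it would be run as `python split_models.py models.pdb` and prints the name of each file it writes.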