- How do I read a PDB file?
Here is an example:
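#include <ESBTL/default.h>
//Occupancy policy
typedef ESBTL::Accept_none_occupancy_policy<ESBTL::PDB::Line_format<> > Accept_none_occupancy_policy;
//Create one system with all atoms.
ESBTL::PDB_line_selector sel;
std::vector<ESBTL::Default_system> systems;
//Build the system from the pdb file.
ESBTL::All_atom_system_builder<ESBTL::Default_system> builder(systems,sel.max_nb_systems());
ESBTL::read_a_pdb_file(filename,sel,builder,Accept_none_occupancy_policy());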
Basically a system has to be defined to contain the PDB structure. The system is then populated by a builder. Each step (system definition, builder...) is customizable.
- How do I print out the atom name?
Here is an example that iterates over all atoms in a model and prints out each atom name:
for (ESBTL::Default_system::Model::Atoms_const_iterator it_atm=model.atoms_begin();it_atm!=model.atoms_end();++it_atm){
std::cout << it_atm->atom_name() << std::endl;
}
The same example would work to print out the atom_serial_number, the atom type (element), the charge or the coordinates x,y,z.
- How do I get the residue name of an atom?
Using the same type of for loop as above, and having atom iterator it_atm, this would simply look like:
it_atm->residue_name()
More generally, each Model, Chain, Residue and Atom object gives access to its parent in the data structure (System, Model, Chain and Residue, respectively).
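To illustrate the idea of parent access, here is a minimal standalone sketch. It does not use ESBTL: the `Atom` and `Residue` structs and the `add_atom` helper are hypothetical names chosen for this example, and ESBTL's own implementation may differ. Each atom stores a back-reference to its residue, so a call such as `residue_name()` can simply forward to the parent.

```cpp
#include <string>
#include <vector>

struct Residue;  // forward declaration so that Atom can point to it

// Hypothetical atom type (not an ESBTL class).
struct Atom {
  std::string name;
  const Residue* parent;  // back-reference to the owning residue
  const std::string& residue_name() const;
};

// Hypothetical residue type (not an ESBTL class).
struct Residue {
  std::string name;
  std::vector<Atom> atoms;

  // Adds an atom and wires its back-reference to this residue.
  void add_atom(const std::string& atom_name) {
    Atom a;
    a.name = atom_name;
    a.parent = this;
    atoms.push_back(a);
  }
};

// The atom answers the question by asking its parent.
inline const std::string& Atom::residue_name() const { return parent->name; }
```

In this simplified version, copying a `Residue` leaves the copied atoms pointing at the original residue, so a real container would have to update the back-references on copy.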
- How do I iterate over chains?
Iterators are provided to access each level of the hierarchy (System, Model, Chain, Residue). For example, to access all the residues of a chain (iterating over the chains of a model follows the same pattern):
ESBTL::Default_system::Chain chain;
for (ESBTL::Default_system::Chain::Residues_iterator
it=chain.residues_begin();
it!=chain.residues_end();
++it
){
... //Things we want to do on the residues
}
- How can I select two chains and save them to a PDB file?
Having a system containing many chains, it is possible to only select the chains "A" and "B" and write them to a PDB file as shown in the following example:
typedef ESBTL::Selected_atom_iterator<ESBTL::Default_system::Model,ESBTL::Select_by_chainids,true> Restrict_iterator;
ESBTL::Select_by_chainids sel_chn("AB");
std::ofstream output("selection.pdb");
for (Restrict_iterator itr=ESBTL::make_selected_atom_iterator(model.atoms_begin(),sel_chn);
itr!=ESBTL::make_selected_atom_iterator<ESBTL::Select_by_chainids>(model.atoms_end());++itr){
output << ESBTL::PDB::get_atom_pdb_format(*itr) <<"\n";
}
output.close();
- How do I select chains "A" and "B" at the PDB reading stage and iterate over their atoms?
Here is an example:
std::list<std::string> chains_to_select;
chains_to_select.push_back("AB");
//Select chains and discard hydrogen atoms and water molecule
ESBTL::PDB_line_selector_chain sel(chains_to_select.begin(),chains_to_select.end(),false,false);
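std::vector<ESBTL::Default_system> systems;
unsigned nb_systems=chains_to_select.size();
ESBTL::All_atom_system_builder<ESBTL::Default_system> builder(systems,nb_systems);
ESBTL::read_a_pdb_file("file.pdb",sel,builder,Accept_none_occupancy_policy());
//Iterate over the atoms of the first model of the first system
ESBTL::Default_system::Model& model=*systems[0].models_begin();
for (ESBTL::Default_system::Model::Atoms_const_iterator it_atm=model.atoms_begin();it_atm!=model.atoms_end();++it_atm){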
... //Things we want to do on the atoms
}
- How is occupancy handled by default?
Occupancy is NOT handled by default: the user has to choose a policy. The examples above use Accept_none_occupancy_policy, which rejects every atom whose occupancy is not equal to 1. Conversely, Accept_all_occupancy_policy accepts all atoms. Policies are customizable as well.
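The policy idea can be sketched outside of ESBTL as follows. This is only an illustration under stated assumptions: `Atom_record`, `Reject_partial_occupancy`, `Accept_every_occupancy` and `filter_atoms` are hypothetical names invented for this example and are not part of the ESBTL API. The point is that the reading code is written once and the accept/reject decision is delegated to a small interchangeable class.

```cpp
#include <vector>

// Hypothetical stand-in for a parsed PDB atom record (not an ESBTL type).
struct Atom_record {
  int serial;
  double occupancy;
};

// Rejects every atom whose occupancy differs from 1.
struct Reject_partial_occupancy {
  bool keep(const Atom_record& a) const { return a.occupancy == 1.0; }
};

// Accepts every atom, whatever its occupancy.
struct Accept_every_occupancy {
  bool keep(const Atom_record&) const { return true; }
};

// Generic filtering step: only the atoms kept by the policy are returned.
template <class Policy>
std::vector<Atom_record> filter_atoms(const std::vector<Atom_record>& input,
                                      const Policy& policy) {
  std::vector<Atom_record> kept;
  for (const Atom_record& a : input)
    if (policy.keep(a)) kept.push_back(a);
  return kept;
}
```

A custom policy in this sketch is any class providing a compatible `keep` member, for example one that keeps atoms whose occupancy is above a chosen threshold.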
- How is alternate location of atoms handled by default?
By default, the alternate location identifier of the first atom read that carries one is used. To get atoms with alternate location "B", pass it as the last argument:
ESBTL::read_a_pdb_file("file.pdb",sel,builder,Accept_none_occupancy_policy(),'B');
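The rule above can be sketched as a standalone filter. This is one possible reading of the behavior, not ESBTL's implementation: `Atom_record` and `select_altloc` are hypothetical names, and a blank character is assumed to mean "no alternate location flag".

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed atom (not an ESBTL type).
// altloc == ' ' means the atom carries no alternate location flag.
struct Atom_record {
  std::string name;
  char altloc;
};

// Keeps atoms without a flag, plus atoms whose flag equals `wanted`.
// When `wanted` is blank, the first non-blank flag encountered is adopted.
std::vector<Atom_record> select_altloc(const std::vector<Atom_record>& atoms,
                                       char wanted = ' ') {
  std::vector<Atom_record> kept;
  for (const Atom_record& a : atoms) {
    if (a.altloc == ' ') {  // unflagged atoms are always kept
      kept.push_back(a);
      continue;
    }
    if (wanted == ' ') wanted = a.altloc;  // adopt the first flag seen
    if (a.altloc == wanted) kept.push_back(a);
  }
  return kept;
}
```

Calling `select_altloc(atoms)` mimics the default, while `select_altloc(atoms, 'B')` mimics an explicit request for location "B".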
- How about PDB files missing the atom type column?
ESBTL can handle any kind of molecule. For some functions (such as is_hydrogen) the element field of an atom is needed. The default PDB reader requires this field and will stop reading the file if this field is not available or empty.
This can be modified by changing the default PDB line format:
//Having defined the following types and values:
std::string filename=...
Line_selector sel=...
Builder builder=...
Occupancy_policy occupancy=...
char altloc=...
//and this class that indicates which fields are mandatory
struct My_mandatory_fields{
static const bool record_name=true;
static const bool atom_serial_number=true;
static const bool atom_name=true;
static const bool alternate_location=false;
static const bool residue_name=true;
static const bool chain_identifier=false;
static const bool residue_sequence_number=true;
static const bool insertion_code=false;
static const bool x=true;
static const bool y=true;
static const bool z=true;
static const bool occupancy=true;
static const bool temperature_factor=true;
static const bool element=false;
static const bool charge_str=false;
static const bool model_number=true;
};
//then the following line is used to read the pdb file (this is what is hidden behind read_a_pdb_file)
ESBTL::Line_reader<ESBTL::PDB::Line_format<My_mandatory_fields>,Line_selector,Builder>(sel,builder).template read<ESBTL::ASCII>(filename,occupancy,altloc);
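To make the role of the mandatory-field flags more concrete, here is a standalone sketch that does not use ESBTL. It relies only on the fixed-column layout of the PDB format, in which the element symbol occupies columns 77-78 of an ATOM record. The names `pdb_field`, `read_element`, `Element_required` and `Element_optional` are hypothetical and chosen for this example.

```cpp
#include <cstddef>
#include <string>

// Returns the field at 1-based columns [first,last] with surrounding blanks
// removed. Returns an empty string if the line is too short or the field is blank.
std::string pdb_field(const std::string& line, std::size_t first, std::size_t last) {
  if (line.size() < first) return "";
  std::string raw = line.substr(first - 1, last - first + 1);
  std::size_t b = raw.find_first_not_of(' ');
  if (b == std::string::npos) return "";
  std::size_t e = raw.find_last_not_of(' ');
  return raw.substr(b, e - b + 1);
}

// Compile-time flags in the spirit of a mandatory-fields class.
struct Element_required { static const bool element = true; };
struct Element_optional { static const bool element = false; };

// Reads the element symbol (columns 77-78).
// Fails only when the field is missing AND declared mandatory.
template <class Mandatory>
bool read_element(const std::string& line, std::string& element) {
  element = pdb_field(line, 77, 78);
  return !element.empty() || !Mandatory::element;
}
```

With `Element_optional`, a line truncated after the temperature factor is still accepted and the element is simply left empty, which is the effect obtained by setting `element=false` in the class above.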
- How are Hydrogen atoms handled by default?
There is no default distinction between the atoms. They are all considered independently of their type.
Whether hydrogen atoms will be used is declared at the reading/selection stage using a line selector. For example, the following chain selection discards hydrogen atoms (first occurrence of false) and water molecules (second occurrence of false):
ESBTL::PDB_line_selector_chain sel(chains_to_select.begin(),chains_to_select.end(),false,false);
- How can I associate a radius with an atom?
You can associate a Van der Waals radius with an atom by attaching a property to it using the Generic_classifier. By default, the set of radii used is taken from Tsai J, Taylor R, Chothia C, Gerstein M. J Mol Biol. 1999 Jul 2;290(1):253-66. A special constructor is provided to use your own set of radii.
#include <ESBTL/atom_classifier.h>
typedef ESBTL::Generic_classifier<ESBTL::Radius_of_atom<double,ESBTL::Default_system::Atom> > T_Atom_classifier;
//The default constructor uses the default set of radii
T_Atom_classifier atom_classifier;
for (ESBTL::Default_system::Model::Atoms_const_iterator it_atm=model.atoms_begin();
it_atm!=model.atoms_end(); ++it_atm)
double radius=atom_classifier.get_properties(*it_atm).value();
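The principle of associating a radius with an atom can be sketched as a simple lookup keyed by element. This standalone example does not use ESBTL and does not reproduce the Tsai et al. set used by default: it uses the widely quoted Bondi Van der Waals radii for a few elements instead. The function name `vdw_radius` and the fallback value of 2.0 are assumptions made for this illustration.

```cpp
#include <map>
#include <string>

// Returns the Bondi Van der Waals radius (in angstroms) of a few common
// elements, or `fallback` when the element is not in the table.
// The default fallback of 2.0 is an arbitrary choice for this sketch.
double vdw_radius(const std::string& element, double fallback = 2.0) {
  static const std::map<std::string, double> radii = {
    {"H", 1.20}, {"C", 1.70}, {"N", 1.55}, {"O", 1.52}, {"S", 1.80}
  };
  std::map<std::string, double>::const_iterator it = radii.find(element);
  return it == radii.end() ? fallback : it->second;
}
```

A classifier with a user-supplied set of radii would follow the same pattern, with the table passed in at construction time rather than hard-coded.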
- How can I define a coarse grain representation of a protein?
You need to declare a system that can handle coarse grain residues and atoms, and a coarse grain creator.
#include <ESBTL/coarse_grain.h>
#include <ESBTL/coarse_creators.h>
//a default system with coarse grain residues and atoms.
//This system provides both the classical all-atoms system and the coarse grain one
typedef ESBTL::Default_system_with_coarse_grain System;
//occupancy policy
typedef ESBTL::Accept_none_occupancy_policy<ESBTL::PDB::Line_format<> > Accept_none_occupancy_policy;
ESBTL::PDB_line_selector_two_systems sel;
std::vector<System> systems;
ESBTL::All_atom_system_builder<System> builder(systems,sel.max_nb_systems());
//class responsible for building a coarse grain representation for proteins
//using at most two pseudo atoms per residue: one as the barycenter of
//the backbone heavy atoms, and one as the barycenter of side-chain heavy atoms (except for glycine).
ESBTL::Coarse_creator_two_barycenters<System::Residue> creator;
//read pdb file
ESBTL::read_a_pdb_file(argv[1],sel,builder,Accept_none_occupancy_policy());
System::Model& model=*systems[0].models_begin();
//create coarse atoms
for (System::Model::Residues_iterator it_res=model.residues_begin();
it_res!=model.residues_end();++it_res)
it_res->create_coarse_atoms(creator); //creates coarse atoms for this residue.
//iterator type over coarse grain atoms
typedef ESBTL::Coarse_atoms_iterators<System::Model>::const_iterator Coarse_atoms_iterator;
for (Coarse_atoms_iterator itc=coarse_atoms_begin(model);
itc!=coarse_atoms_end(model);++itc)