Friday, December 9, 2011

OBMol de-/serialization revisited

The current serialization/deserialization mechanism in pgchem and mychem does not preserve stereochemistry of OBMol objects properly. As Tim Vandermeersch wrote:

The OBChiralData isn't used anymore. Also the functions OBAtom::IsClockwise, and OBAtom::IsAntiClockwise are obsolate. Instead, you should serialize the OBCisTransStereo and OBTetrahedralStereo data objects.

Fortunately (for me, since I don't have the time at the moment), Jérôme Pansanel has:

...started the serialization of the OBCisTransStereo and OBTetrahedralStereo objects.

For the time being, I have tried the following workaround and it seems to be working well. First, I have removed now unneccessary code from the unserialization:

bool unserializeOBMol(OBBase* pOb, const char *serializedInput)
{
  OBMol* pmol = pOb->CastAndClear<OBMol>();
  OBMol &mol = *pmol;
  unsigned int i,natoms,nbonds;

  unsigned int *intptr = (unsigned int*) serializedInput;

  ++intptr;

  natoms = *intptr;

  ++intptr;

  nbonds = *intptr;

  ++intptr;

  _ATOM *atomptr = (_ATOM*) intptr;

  mol.ReserveAtoms(natoms);

  OBAtom atom;
  int stereo;

  for (i = 1; i <= natoms; i++) {
    atom.SetIdx(atomptr->idx);
    atom.SetHyb(atomptr->hybridization);
    atom.SetAtomicNum((int) atomptr->atomicnum);
    atom.SetIsotope((unsigned int) atomptr->isotope);
    atom.SetFormalCharge((int) atomptr->formalcharge);
    stereo = atomptr->stereo;

    if(stereo == 3) {
      atom.SetChiral();
    }

    atom.SetSpinMultiplicity((short) atomptr->spinmultiplicity);

    if(atomptr->aromatic != 0) {
      atom.SetAromatic();
    }

    if (!mol.AddAtom(atom)) {
      return false;
    }

    atom.Clear();

    ++atomptr;
  }

  _BOND *bondptr = (_BOND*) atomptr;

  unsigned int start,end,order,flags;

  for (i = 0;i < nbonds;i++) {
    flags = 0;

    start = bondptr->beginidx;
    end = bondptr->endidx;
    order = (int) bondptr->order;

    if (start == 0 || end == 0 || order == 0 || start > natoms || end > natoms) {
      return false;
    }

    order = (unsigned int) (order == 4) ? 5 : order;

    stereo = bondptr->stereo;

    if (stereo) {
      if (stereo == 1) {
        flags |= OB_WEDGE_BOND;
      }
      if (stereo == 6) {
        flags |= OB_HASH_BOND;
      }
    }

    if (bondptr->aromatic != 0) {
      flags |= OB_AROMATIC_BOND;
    }

    if (!mol.AddBond(start,end,order,flags)) {
      return false;
    }

    ++bondptr;
  }

  intptr = (unsigned int*) bondptr;

  mol.SetAromaticPerceived();
  mol.SetKekulePerceived();

  return true;
}
Then, when matching, I check if the query might contain tetrahedral stereo. If yes, I build the target OBMol fresh from SMILES, if no directly from the serialized object:
 if (strchr (querysmi, '@') != NULL)
    {
        //Match against an OBMol generated from SMILES
    }
    else
    {
         //Match against an OBMol deserialized from binary
    }

No comments:

Post a Comment