This module identifies the sequence format of a given piece of content, whether it is supplied as a file, a string, or a filehandle.
One way to guess the format of a flat file, given a filename, is through the following code snippet:
```
my $guesser = Bio::Tools::GuessSeqFormat->new( -file => $filename );
my $format = $guesser->guess;
```
Alternatively, the format can be guessed from an already open filehandle:
```
my $guesser = Bio::Tools::GuessSeqFormat->new( -fh => $filehandle );
my $format = $guesser->guess;
```
If the filehandle is seekable (STDIN usually isn't, for example when it is a pipe or a terminal), it will be returned to its original position.
The module can also guess the format of one or several lines of text held in a scalar (with embedded newlines):
```
my $guesser = Bio::Tools::GuessSeqFormat->new( -text => $linesoftext );
my $format = $guesser->guess;
```
Another way to use Bio::Tools::GuessSeqFormat is to create the object first and set the filename, filehandle, or text to parse afterwards:
```
my $guesser = Bio::Tools::GuessSeqFormat->new();
# Set whichever input source applies (normally only one of these)
$guesser->file($filename);
$guesser->fh($filehandle);
$guesser->text($linesoftext);
my $format = $guesser->guess;
```
To guess in one go, given a filename, this code can be used:
```
my $format = Bio::Tools::GuessSeqFormat->new( -file => $filename )->guess;
```
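Because guess() returns undef when no conclusive guess can be made, a chained one-liner can hand an undefined format to later code without any warning. The following is a minimal sketch of one way to make that case explicit. The helper name guess_or_die and its error message are hypothetical and not part of Bio::Tools::GuessSeqFormat; the helper only assumes that the object it receives has a guess() method.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): turn an inconclusive guess,
# which guess() reports as undef, into an explicit error.
sub guess_or_die {
    my ($guesser, $label) = @_;

    # $guesser is assumed to be any object with a guess() method,
    # such as a Bio::Tools::GuessSeqFormat instance.
    my $format = $guesser->guess;

    die "Could not determine sequence format of $label\n"
        unless defined $format;

    return $format;
}
```

With BioPerl installed, it could be called as guess_or_die( Bio::Tools::GuessSeqFormat->new( -file => $filename ), $filename ). Whether to die, warn, or fall back to a default format is a design choice that depends on the calling script.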
Bio::Tools::GuessSeqFormat tries to guess the format ("swiss", "pir", "fasta", etc.) of the sequence or multiple sequence alignment (MSA) in a file, in a scalar, or through a filehandle. The guess() method examines the data line by line until it finds a line to which only one format can be assigned. If no conclusive guess can be made, undef is returned.
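The line-by-line strategy described above can be illustrated with a small, self-contained sketch. This is not the module's actual code: the function name guess_format_sketch and the four regular expressions are simplified assumptions chosen for illustration, and the real module recognises many more formats with more careful tests.

```perl
use strict;
use warnings;

# Toy per-line tests (assumptions for illustration only, not the
# module's real rules). Each returns true if the line could belong
# to that format.
my %line_tests = (
    fasta   => sub { $_[0] =~ /^>/ },
    genbank => sub { $_[0] =~ /^LOCUS\s/ },
    embl    => sub { $_[0] =~ /^ID\s/ || $_[0] =~ /^SQ\s+Sequence\b/ },
    swiss   => sub { $_[0] =~ /^ID\s/ || $_[0] =~ /^SQ\s+SEQUENCE\b/ },
);

# Examine the text line by line and stop at the first line that
# matches exactly one format. Lines matching no format, or more than
# one, are inconclusive and are skipped.
sub guess_format_sketch {
    my ($text) = @_;

    for my $line (split /\n/, $text) {
        my @matches = grep { $line_tests{$_}->($line) } sort keys %line_tests;
        return $matches[0] if @matches == 1;
    }

    return;    # no conclusive line: undef in scalar context
}

# The ID line fits both "embl" and "swiss", so the decision is
# deferred until the SQ line, which only the "swiss" test accepts.
my $format = guess_format_sketch("ID   X\nSQ   SEQUENCE   10 AA;\n");
print defined $format ? "$format\n" : "undetermined\n";    # prints "swiss"
```

The sketch shows why an ambiguous first line does not end the search, and why input in which every line is ambiguous or unrecognised yields undef.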
Overall, Bio::Tools::GuessSeqFormat is useful for anyone working with biological data who needs to determine the format of a file or sequence quickly.
Version 1.6.0: N/A