BeeTLe#
A deep learning framework for linear B-cell epitope prediction and antibody type-specific epitope classification using Transformer and LSTM encoders.
Usage#
Command Line#
After installation, run a command like the one below. Predicting 10,000 peptides takes a few seconds.
python cli.py -i input.fasta -o output.csv
To show help, run python cli.py -h. The input is a FASTA file of peptides. The output is a table with the following columns:
identifier: FASTA header.
sequence: FASTA sequence.
score: Probability of being epitope.
epitope: {0, 1}. 1 for epitope (score > 0.5).
Ig: {A, E, M}. The antibody type (IgA, IgE, or IgM) that the peptide is most likely to bind.
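As a rough sketch of how the output table might be post-processed, the snippet below keeps only the predicted epitopes and sorts them by score. It assumes the CSV is comma-delimited and has a header row using the column names listed above; the helper name load_predicted_epitopes and the sample rows are made up for illustration and are not real predictions.

```python
import csv
import io

# Made-up rows for illustration only; real values come from cli.py.
# Assumption: the file has a header row with the column names listed above.
SAMPLE_OUTPUT = """identifier,sequence,score,epitope,Ig
pep1,ACDEFGHIK,0.91,1,E
pep2,LMNPQRSTV,0.12,0,A
pep3,WYACDEFGH,0.67,1,M
"""


def load_predicted_epitopes(handle, threshold=0.5):
    """Return rows whose score exceeds the threshold, highest score first."""
    rows = []
    for row in csv.DictReader(handle):
        row["score"] = float(row["score"])
        if row["score"] > threshold:
            rows.append(row)
    return sorted(rows, key=lambda r: r["score"], reverse=True)


epitopes = load_predicted_epitopes(io.StringIO(SAMPLE_OUTPUT))
for row in epitopes:
    print(row["identifier"], row["sequence"], row["score"], row["Ig"])
```

To read a real result, pass an open file instead of the in-memory sample, for example open("output.csv", newline="").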
Web App#
To use BeeTLe without installing it, go to the web app on Streamlit.
Installation#
Linux is preferred. A GPU is not required.
Clone this repo and navigate to the repo folder.
Install with pip, preferably in a virtual environment:
pip install -r requirements.txt
Alternatively, for a more fully specified environment, use mamba on Linux:
mamba env create -p ./envs -f environment.yml
mamba activate ./envs
Data#
Follow the notebook data/dataset.py to generate the datasets, in which redundancy and false negatives are reduced. The raw data is available on figshare.
Development#
The code is designed to be reusable and extensible, and it can be adapted to other peptide classification tasks. Some useful components are:
Loss functions: logit-adjusted, focal; sigmoid, softmax.
LSTM (packed variable length input), Transformer encoder, attention.
Amino acid encoder.
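To make the loss functions named above more concrete, here is a minimal, framework-free sketch of two of them for a single example: a sigmoid focal loss and a logit-adjusted softmax cross-entropy. It follows the commonly used textbook definitions and is not taken from this repository, which most likely operates on batched tensors. The function names, the gamma and tau defaults, and the class priors in the usage lines are assumptions for illustration.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x)), computed as -softplus(-x)."""
    return -(max(-x, 0.0) + math.log1p(math.exp(-abs(x))))


def sigmoid_focal_loss(logit, target, gamma=2.0):
    """Binary focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    p_t is the predicted probability of the true class (target is 0 or 1).
    With gamma = 0 this reduces to ordinary binary cross-entropy.
    """
    z = logit if target == 1 else -logit
    log_pt = log_sigmoid(z)
    pt = math.exp(log_pt)
    return -((1.0 - pt) ** gamma) * log_pt


def logit_adjusted_cross_entropy(logits, target, priors, tau=1.0):
    """Softmax cross-entropy on logits shifted by tau * log(class prior).

    Rare classes get a larger penalty, which counteracts class imbalance.
    With uniform priors this equals ordinary softmax cross-entropy.
    """
    adjusted = [z + tau * math.log(p) for z, p in zip(logits, priors)]
    m = max(adjusted)
    log_norm = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_norm - adjusted[target]


# Hypothetical usage: an easy positive example is down-weighted by the focal
# term, and a rare class (prior 0.1) is penalized more than a common one.
easy = sigmoid_focal_loss(logit=4.0, target=1)
hard = sigmoid_focal_loss(logit=-1.0, target=1)
rare = logit_adjusted_cross_entropy([0.0, 0.0], target=1, priors=[0.9, 0.1])
common = logit_adjusted_cross_entropy([0.0, 0.0], target=0, priors=[0.9, 0.1])
print(easy < hard, rare > common)
```

The sigmoid variant fits the binary epitope score, while the softmax variant fits a multi-class target such as the three antibody types.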