Extending Context Window of Large Language Models via Positional Interpolation_ arxiv2306.15595.pdf
Extending Context Window of Large Language Models via Positional Interpolation_ arxiv2306.15595.pdf
PDF
1 year ago
733.61 kB
Efficient streaming language models with attention sinks_ arxiv2309.17453.pdf
PDF
1 year ago
11.81 MB
GLU Variants Improve Transformer_arxiv2002.05202.pdf
PDF
1 year ago
106.58 kB
GQA_ Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints_ arxiv2305.13245.pdf
PDF
1 year ago
248.19 kB
Climbing towards Natural Language Understanding_ On Meaning Form and Understanding in the Age of Data_ Emily M Bender- Alexander Koller_ 2020.pdf
PDF
1 year ago
472.23 kB
Are Emergent Abilities of Large Language Models a Mirage_ arxiv2304.15004.pdf
PDF
1 year ago
1.82 MB
An Ultra-Low Energy Internally Analog, Externally Digital Vector-Matrix Multiplier Based on NOR Flash Memory Technology_ M Reza Mahmoodi_ Dmitri Strukov_ 2018.pdf
PDF
1 year ago
1.72 MB
LLaMA_ Open and Efficient Foundation Language Models_ arxiv2302.13971.pdf
PDF
1 year ago
709.54 kB
Landmark Attention_ Random-Access Infinite Context Length for Transformers_ arxiv2305.16300.pdf
PDF
1 year ago
500.21 kB
Llama 2_ Open Foundation and Fine-Tuned Chat Models_ arxiv2307.09288.pdf
PDF
1 year ago
13.03 MB
Photonic Matrix Computing_ From Fundamentals to Applications_ Junwei Cheng_ Hailong Zhou_ Jianji Dong_ Nanomaterials 2021.pdf.pdf
PDF
1 year ago
3.45 MB
RoFormer_ Enhanced Transformer with Rotary Position Embedding_ arxiv2104.09864v4.pdf
PDF
1 year ago
572.58 kB
SentencePiece_ A simple and language independent subword tokenizer and detokenizer for Neural Text Processing_ arxiv1808.06226.pdf
PDF
1 year ago
206.70 kB
Stay on topic with Classifier-Free Guidance_ arxiv2306.17806.pdf
PDF
1 year ago
1.91 MB
The Curse of Recursion_ Training on Generated Data Makes Models Forget_ arxiv2305.17493.pdf
PDF
1 year ago
2.24 MB
The Poison of Alignment_ arxiv2308.13449.pdf
PDF
1 year ago
185.34 kB
The Transformer Model in Equations_ John Thickstun_ 2023.pdf
PDF
1 year ago
The case for 4-bit precision_ k-bit Inference Scaling Laws_ arxiv2212.09720.pdf
PDF
1 year ago
884.70 kB
Train Short, Test Long_ Attention with Linear Biases Enables Input Length Extrapolation_arxiv2108.12409.pdf
PDF
1 year ago
741.24 kB
Unigram Algorithm_ Subword Regularization_ Improving Neural Network Translation Models with Multiple Subword Candidates_ arxiv1804.10959.pdf
PDF
1 year ago
321.81 kB