๐Ÿถ
I am loser cheems
๐Ÿถ
I am loser cheems


Jingze Shi

Experience ๐Ÿ•

  • 2022.9-Present Undergraduate Student

Competition Awards ๐Ÿ†

Publications 📝

  • OmniMoE: An Efficient MoE by Orchestrating Atomic Experts at Scale [Paper]
  • Towards Automated Kernel Generation in the Era of LLMs [Paper]
  • Trainable Dynamic Mask Sparse Attention [Paper]
  • Concise Reasoning, Big Gains: Pruning Long Reasoning Trace with Difficulty-Aware Prompting [Paper]

Research Directions 🔭

  • Natural Language Processing
  • Large Language Models
  • Small Language Models
  • Foundation Models
  • Deep Reinforcement Learning
  • Highly Efficient Algorithms

Skills ⚒️

  • Natural Languages: Simplified Chinese, English
  • Programming Languages: C++, Python
  • Typesetting Languages: Markdown, LaTeX
  • Programming Frameworks: PyTorch, Transformers

Pinned Repositories

  1. huggingface/transformers

     🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training. (Python, 158k stars, 32.5k forks)

  2. pytorch/pytorch

     Tensors and Dynamic neural networks in Python with strong GPU acceleration (Python, 98.3k stars, 27.2k forks)

  3. huggingface/trl

     Train transformer language models with reinforcement learning. (Python, 17.7k stars, 2.6k forks)

  4. Dao-AILab/flash-attention

     Fast and memory-efficient exact attention (Python, 22.8k stars, 2.5k forks)

  5. flash-algo/flash-sparse-attention

     Trainable fast and memory-efficient sparse attention (Python, 559 stars, 54 forks)

  6. flash-algo/omni-moe

     An Efficient MoE by Orchestrating Atomic Experts at Scale (Python, 105 stars, 2 forks)