feat(eagle3):support qwen3 dense model #5879
ced83ff
to
99adbc4
Compare
/bot run
PR_Github #11435 [ run ] triggered by Bot
PR_Github #11435 [ run ] completed with state
@mikeiovine for visibility on this Eagle-3 enablement for the Qwen3 dense model.
@xq25478 Can you help to add
/bot run
/bot run
@xq25478 The second commit is missing its sign-off, so the DCO check fails. Can you help fix it?
PR_Github #11489 [ run ] triggered by Bot
67411bd
to
e592199
Compare
Fixed! Merged into one commit.
PR_Github #11489 [ run ] completed with state
e592199
to
1ee2c1c
Compare
/bot run
/bot run
PR_Github #11531 [ run ] triggered by Bot
PR_Github #11531 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #11587 [ run ] triggered by Bot
PR_Github #11587 [ run ] completed with state
1ee2c1c
to
e73e864
Compare
The API has changed in the latest code: https://github.com/xq25478/TensorRT-LLM/blob/support_qwen3_dense_eagle3/tensorrt_llm/llmapi/llm_args.py#L337C17-L337C38 To verify, you can run: LLM_MODELS_ROOT=/tmp/Qwen3/ pytest -s tests/integration/defs/accuracy/test_llm_api_pytorch.py::TestQwen3_8B::test_eagle3
Thank you, fixed.
Signed-off-by: xq25478 <xq25478@qq.com>
a4b7c0d
to
e9ebd36
Compare
/bot run
PR_Github #12088 [ run ] triggered by Bot
PR_Github #12088 [ run ] completed with state
@xq25478 An error occurs when running pre-commit run -a before the CI. Can you help fix it?
Signed-off-by: xq25478 <xq25478@qq.com>
Head branch was pushed to by a user without write access
Walkthrough
The updates introduce speculative decoding support for the Qwen3 model by modifying model classes to accept and propagate
Changes
Sequence Diagram(s)
sequenceDiagram
participant Test as TestQwen3_8B.test_eagle3
participant LLM as LLM (PyTorch)
participant SpecModel as Eagle3 Model
participant TargetModel as Qwen3-8B Model
participant MMLU as MMLU Evaluator
Test->>LLM: Instantiate with Eagle speculative config\n(spec_model=Eagle3, target_model=Qwen3-8B)
LLM->>SpecModel: Load speculative model
LLM->>TargetModel: Load target model
Test->>LLM: Context enter (with LLM)
Test->>MMLU: Evaluate LLM on MMLU
MMLU->>LLM: Query for predictions
LLM->>MMLU: Return answers (using Eagle speculative decoding)
Test->>LLM: Context exit (cleanup)
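The "using Eagle speculative decoding" step in the diagram can be illustrated with a minimal, generic draft-and-verify sketch. This is not TensorRT-LLM code and not Eagle-3's actual drafting mechanism: `speculative_generate`, `toy_target`, and `toy_draft` are hypothetical names, and the "models" are plain Python functions that map a token prefix to the next token. It only shows the general idea that a cheap draft proposes several tokens and the target model accepts the longest prefix it agrees with.

```python
from typing import Callable, List

# Hypothetical stand-in type: a "model" maps a token prefix to its greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_generate(target: NextToken, draft: NextToken,
                         prompt: List[int], max_new_tokens: int,
                         draft_len: int = 4) -> List[int]:
    """Greedy draft-and-verify decoding (illustrative sketch only).

    The result is identical to plain greedy decoding with `target`;
    the draft model only changes how many tokens are settled per round.
    """
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1. The draft model proposes up to draft_len tokens autoregressively.
        proposal: List[int] = []
        for _ in range(min(draft_len, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))
        # 2. The target model checks each proposed position in order.
        #    (A real engine would verify all positions in one batched pass.)
        accepted: List[int] = []
        for tok in proposal:
            expected = target(tokens + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                # First mismatch: keep the target's token, drop the rest of the draft.
                accepted.append(expected)
                break
        tokens.extend(accepted)
    return tokens


def toy_target(prefix: List[int]) -> int:
    # Toy target model: counts upward modulo 10.
    return (prefix[-1] + 1) % 10


def toy_draft(prefix: List[int]) -> int:
    # Toy draft model: agrees with the target except right after token 3.
    return 0 if prefix[-1] == 3 else (prefix[-1] + 1) % 10


result = speculative_generate(toy_target, toy_draft, prompt=[0], max_new_tokens=8)
```

Because every accepted token is either confirmed or replaced by the target, a weaker draft model only reduces the speedup, not the output quality. That is presumably why the test in the diagram can evaluate MMLU accuracy on the target model with speculative decoding enabled.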
Fixed.
/bot run
/bot run
PR_Github #12279 [ run ] triggered by Bot
PR_Github #12279 [ run ] completed with state
@xq25478 Thank you for the contribution and for your patience in handling the CI issues. This PR is merged.
Signed-off-by: xq25478 <xq25478@qq.com>
Signed-off-by: xq25478 <xq25478@qq.com>
Signed-off-by: xq25478 <xq25478@qq.com>
Signed-off-by: Shreyas Misra <shreyasm@nvidia.com>
feat(eagle3):support qwen3 dense model in eagle_one_model=False
Summary by CodeRabbit
New Features
Tests