Today, we are delighted to announce that DeepSeek-R1 distilled Llama and Qwen models are available through Amazon Bedrock Marketplace and Amazon SageMaker JumpStart. With this launch, you can now deploy DeepSeek AI's first-generation frontier model, DeepSeek-R1, along with the distilled versions ranging from 1.5 to 70 billion parameters to build, experiment, and responsibly scale your generative AI ideas on AWS.
In this post, we demonstrate how to get started with DeepSeek-R1 on Amazon Bedrock Marketplace and SageMaker JumpStart. You can follow similar steps to deploy the distilled versions of the models as well.
Overview of DeepSeek-R1
DeepSeek-R1 is a large language model (LLM) developed by DeepSeek AI that uses reinforcement learning to enhance reasoning capabilities through a multi-stage training process from a DeepSeek-V3-Base foundation. A key distinguishing feature is its reinforcement learning (RL) step, which was used to refine the model's responses beyond the standard pre-training and fine-tuning process. By incorporating RL, DeepSeek-R1 can adapt more effectively to user feedback and objectives, ultimately improving both relevance and clarity. In addition, DeepSeek-R1 uses a chain-of-thought (CoT) approach, meaning it is equipped to break down complex queries and reason through them step by step. This guided reasoning process allows the model to produce more accurate, transparent, and detailed answers. The model combines RL-based fine-tuning with CoT capabilities, aiming to generate structured responses while focusing on interpretability and user interaction. With its wide-ranging capabilities, DeepSeek-R1 has captured the industry's attention as a versatile text-generation model that can be integrated into various workflows such as agents and logical reasoning tasks.
DeepSeek-R1 uses a Mixture of Experts (MoE) architecture and is 671 billion parameters in size. The MoE architecture activates only 37 billion of those parameters at a time, enabling efficient inference by routing queries to the most relevant expert "clusters." This approach allows the model to specialize in different problem domains while maintaining overall efficiency. DeepSeek-R1 requires at least 800 GB of HBM memory in FP8 format for inference. In this post, we use an ml.p5e.48xlarge instance to deploy the model. An ml.p5e.48xlarge comes with 8 NVIDIA H200 GPUs providing 1128 GB of GPU memory.
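The sizing figures above can be checked with some quick arithmetic. The sketch below uses only the numbers quoted in this post. The one assumption is that FP8 stores one byte per parameter, and the split of the remaining memory into "weights" and "everything else" is illustrative, not a measured breakdown.

```python
# Figures quoted in the post.
TOTAL_PARAMS_B = 671        # total parameters, in billions
ACTIVE_PARAMS_B = 37        # parameters activated by the MoE routing, in billions
NUM_GPUS = 8                # H200 GPUs in one ml.p5e.48xlarge
INSTANCE_MEMORY_GB = 1128   # total GPU memory quoted for the instance
REQUIRED_MEMORY_GB = 800    # minimum HBM quoted for FP8 inference

# Assumption (not from the post): FP8 uses one byte per parameter.
BYTES_PER_PARAM_FP8 = 1


def weight_footprint_gb(params_billions: float, bytes_per_param: int) -> float:
    """Approximate weight size in GB: 1e9 params * bytes / 1e9 bytes per GB."""
    return params_billions * bytes_per_param


active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
memory_per_gpu_gb = INSTANCE_MEMORY_GB / NUM_GPUS
weights_gb = weight_footprint_gb(TOTAL_PARAMS_B, BYTES_PER_PARAM_FP8)
headroom_gb = INSTANCE_MEMORY_GB - REQUIRED_MEMORY_GB

print(f"Active share of parameters: {active_fraction:.1%}")
print(f"Memory per GPU: {memory_per_gpu_gb:.0f} GB")
print(f"Approximate FP8 weight footprint: {weights_gb:.0f} GB")
print(f"Headroom above the 800 GB minimum: {headroom_gb} GB")
```

Under that assumption the weights alone take roughly 671 GB, which is consistent with the stated 800 GB minimum once runtime overhead is added, and the instance leaves about 328 GB to spare.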
DeepSeek-R1 distilled models bring the reasoning capabilities of the main R1 model to more efficient architectures based on popular open models like Qwen (1.5B, 7B, 14B, and 32B) and Llama (8B and 70B). Distillation refers to a process of training smaller, more efficient models to mimic the behavior and reasoning patterns of the larger DeepSeek-R1 model, using it as a teacher model.
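As a rough illustration of the teacher-student idea described above, the sketch below collects a teacher model's responses into prompt/completion pairs that a smaller model could later be fine-tuned on. It is not DeepSeek's actual pipeline: `teacher` is a hypothetical callable standing in for a call to the large model, and `build_distillation_set` is a name invented for this example.

```python
from typing import Callable, Dict, Iterable, List


def build_distillation_set(
    prompts: Iterable[str],
    teacher: Callable[[str], str],  # hypothetical: wraps a call to the teacher model
) -> List[Dict[str, str]]:
    """Collect teacher outputs as supervised examples for a student model."""
    examples = []
    for prompt in prompts:
        completion = teacher(prompt)
        # Skip empty teacher outputs so the student is not trained on blanks.
        if completion.strip():
            examples.append({"prompt": prompt, "completion": completion})
    return examples


if __name__ == "__main__":
    # Stand-in teacher for demonstration only; a real one would call the large model.
    def stub_teacher(prompt: str) -> str:
        return f"Step-by-step reasoning for: {prompt}"

    for row in build_distillation_set(["What is 2 + 2?"], stub_teacher):
        print(row)
```

The resulting pairs would then feed an ordinary supervised fine-tuning job for the smaller model. That step is omitted here because it depends on the training framework.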
You can deploy the DeepSeek-R1 model through either SageMaker JumpStart or Bedrock Marketplace. Because DeepSeek-R1 is an emerging model, we recommend deploying it with guardrails in place. In this post, we use Amazon Bedrock Guardrails to introduce safeguards, prevent harmful content, and evaluate models against key safety criteria. At the time of writing, for DeepSeek-R1 deployments on SageMaker JumpStart and Bedrock Marketplace, Bedrock Guardrails supports only the ApplyGuardrail API. You can create multiple guardrails tailored to different use cases and apply them to the DeepSeek-R1 model, improving user experiences and standardizing safety controls across your generative AI applications.
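One way to use the ApplyGuardrail API around a deployed model is to screen the prompt before inference and the answer afterwards. The following is a minimal sketch of that pattern, assuming a boto3 `bedrock-runtime` client is passed in as `client`. The function names, the `generate` callable (a stand-in for however you invoke your endpoint), and the fallback message are hypothetical and not part of the original post.

```python
from typing import Callable

# Hypothetical fallback text returned when the guardrail intervenes.
BLOCKED_MESSAGE = "This request was blocked by the configured guardrail."


def passes_guardrail(client, guardrail_id: str, guardrail_version: str,
                     text: str, source: str) -> bool:
    """Return True if the guardrail did not intervene on `text`.

    `source` is "INPUT" for user prompts and "OUTPUT" for model responses.
    """
    response = client.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source=source,
        content=[{"text": {"text": text}}],
    )
    return response["action"] != "GUARDRAIL_INTERVENED"


def guarded_generate(client, generate: Callable[[str], str],
                     guardrail_id: str, guardrail_version: str,
                     prompt: str) -> str:
    """Check the prompt, call the model, then check the model's answer."""
    if not passes_guardrail(client, guardrail_id, guardrail_version, prompt, "INPUT"):
        return BLOCKED_MESSAGE
    answer = generate(prompt)  # hypothetical: invokes the DeepSeek-R1 endpoint
    if not passes_guardrail(client, guardrail_id, guardrail_version, answer, "OUTPUT"):
        return BLOCKED_MESSAGE
    return answer
```

In practice you would create the client with `boto3.client("bedrock-runtime")` and supply the identifier and version of a guardrail you have already created. Passing the client in as an argument keeps the function easy to test with a stub.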
Prerequisites
To deploy the DeepSeek-R1 model, you need access to an ml.p5e instance. To check if you have quotas for P5e, [forum.batman.gainedge.org](https://forum.batman.gainedge.org/index.php?action=profile
DeepSeek R1 Model now Available in Amazon Bedrock Marketplace And Amazon SageMaker JumpStart
carlotaabreu9 edited this page 2 months ago