[Sherpa] Add pre-packaged skills with persistence and subagent config for SMAI spaces#1143

Merged
TRNWWZ merged 6 commits into aws:main from aws-srijach:skills-integration-v2 on Apr 22, 2026

@aws-srijach aws-srijach commented Apr 7, 2026

Description

Pre-package SageMaker AI skills from GitHub (https://github.com/awslabs/agent-plugins/tree/main/plugins/sagemaker-ai/skills) into the SMD v4 image and sync them to the user's EBS volume on SMAI space startup, using checksum-based persistence.

  • Bundle skills from awslabs/agent-plugins into /etc/sagemaker/skills/ during Docker build
  • Add sync_skills.sh for checksum-based skills persistence on EBS
  • Add sagemaker_default subagent config
  • Configure skills sync, subagent, and default agent on SMAI space startup
  • Disable old Q chat extension for SMAI spaces
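
A minimal sketch of what checksum-based persistence could look like, assuming the image directory is hashed as a whole, the hash is stored in a lock file on EBS, and the copy is skipped when the stored hash still matches. The function names `dir_checksum` and `sync_dir` and the lock-file format are illustrative assumptions, not the contents of `sync_skills.sh`.

```shell
# Hypothetical helper: print one SHA-256 covering every file path and its
# contents under directory $1. Sorting keeps the result stable across runs.
dir_checksum() {
    (
        cd "$1" || exit 1
        find . -type f | LC_ALL=C sort | while IFS= read -r f; do
            sha256sum "$f"
        done | sha256sum | cut -d' ' -f1
    )
}

# Hypothetical sync: copy $1 into $2 only when the checksum recorded in the
# lock file differs from the current checksum of $1.
sync_dir() {
    local src="$1" dest="$2"
    local lock="$dest/.sagemaker-lock"
    local new_sum
    new_sum="$(dir_checksum "$src")"
    if [ -f "$lock" ] && [ "$(cat "$lock")" = "$new_sum" ]; then
        echo "up-to-date"
        return 0
    fi
    mkdir -p "$dest"
    cp -R "$src"/. "$dest"/
    printf '%s\n' "$new_sum" > "$lock"
    echo "synced"
}
```

Called as `sync_dir /etc/sagemaker/skills "$HOME/.agent/skills"`, this would copy on the first start and on any start after the image contents change. The sketch overwrites matching files and never deletes stale ones; how the real script treats user edits is not shown in this thread.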

Type of Change

  • Image update - Bug fix
  • Image update - New feature
  • Image update - Breaking change
  • SMD image build tool update
  • Documentation update

Release Information

Does this change need to be included in patch version releases? By default, pull requests are only added to the next SMD image minor version release once they are merged in the template folder. Only critical bug fixes or security updates should be applied to new patch versions of existing image minor versions.

  • Yes (Critical bug fix or security update)
  • No (New feature or non-critical change)
  • N/A (Not an image update)

If yes, please explain why:
[Explain the criticality of this change and why it should be included in patch releases]

How Has This Been Tested?

[Describe the tests you ran]

Checklist:

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

Test Screenshots (if applicable):

Related Issues

MCP integration PR: https://github.com/aws/sagemaker-distribution/pull/1108/commits
(separate, not included here)

Additional Notes

Need to add jupyter ai v3

… spaces

- Bundle skills from awslabs/agent-plugins into /etc/sagemaker/skills/ during Docker build
- Add sync_skills.sh for checksum-based skills persistence on EBS
- Add sagemaker_default subagent config
- Configure skills sync, subagent, and default agent on SMAI space startup
- Disable old Q chat extension for SMAI spaces
@aws-srijach aws-srijach requested a review from a team as a code owner April 7, 2026 19:51
Comment thread template/v4/Dockerfile Outdated
rm -rf /tmp/kirocli.zip /tmp/kirocli && \
: && \
# Clone SageMaker AI skills from GitHub
git clone --depth 1 https://github.com/awslabs/agent-plugins.git /tmp/agent-plugins && \
OK for now. Can we write this in a way that scales easily if we have more GitHub repos? For example, a config or list of the skills that the code reads from and then clones.
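
One way the suggestion above could be sketched, assuming a plain-text list with one `<name> <git-url>` pair per line. The list format, the `clone_from_list` function, and the `DRY_RUN` switch are hypothetical and only illustrate the idea of driving clones from a list instead of hardcoding them.

```shell
# Hypothetical list format: one "<name> <git-url>" pair per line.
# Blank lines and lines starting with '#' are skipped.
clone_from_list() {
    local list_file="$1" dest_root="$2" name url
    while read -r name url || [ -n "$name" ]; do
        case "$name" in ''|'#'*) continue ;; esac
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Print the command instead of touching the network.
            echo "git clone --depth 1 $url $dest_root/$name"
        else
            git clone --depth 1 "$url" "$dest_root/$name" || return 1
        fi
    done < "$list_file"
}
```

Adding another repository then means adding one line to the list rather than another `git clone` in the Dockerfile.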

{
"name": "sagemaker_default",
"description": "SageMaker AI assistant with pre-installed skills for model customization, evaluation, deployment, and HyperPod operations.",
"tools": ["@builtin"],
Use all tools here; this field defines which tools are available.

"name": "sagemaker_default",
"description": "SageMaker AI assistant with pre-installed skills for model customization, evaluation, deployment, and HyperPod operations.",
"tools": ["@builtin"],
"allowedTools": [],
add read-only Jupyter MCP tools

bash /etc/sagemaker/skills/sync_skills.sh || echo "Warning: skills sync failed, continuing..."

# Install subagent config (always overwrite)
mkdir -p "$HOME/.kiro/agents"
nit: maybe we also have a sync_agent.sh

IMAGE_SKILLS_DIR="/etc/sagemaker/skills"
EBS_SKILLS_DIR="$HOME/.agent/skills"
LOCK_FILE="$EBS_SKILLS_DIR/.sagemaker-lock"
KIRO_SKILLS_DIR="$HOME/.kiro/skills"
nit: can we parameterize this so it is easy to add a Claude folder in the future? This can also be done in a follow-up if/when we add Claude.
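
A rough sketch of the parameterization asked for above, assuming each agent's skills folder is a symlink to the single EBS copy. The `link_skills` function and the symlink approach are assumptions for illustration; only the `$HOME/.kiro/skills` path and the `AGENT_SKILLS_DIRS` name come from this PR.

```shell
# Agent-specific skills locations. Supporting another agent later would mean
# appending one more path here (hypothetical; no such path is given in this PR).
AGENT_SKILLS_DIRS=("$HOME/.kiro/skills")

# Hypothetical helper: point every target path at the shared skills directory.
link_skills() {
    local source_dir="$1"; shift
    local target
    for target in "$@"; do
        # Leave a real (non-symlink) directory alone rather than nesting a link in it.
        if [ -d "$target" ] && [ ! -L "$target" ]; then
            echo "Warning: $target is a real directory, skipping"
            continue
        fi
        mkdir -p "$(dirname "$target")"
        ln -sfn "$source_dir" "$target"
    done
}
```

Usage would be `link_skills "$HOME/.agent/skills" "${AGENT_SKILLS_DIRS[@]}"`, so the loop body never changes when the list grows.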

# Setup skills and subagent for SMAI spaces
if [ -n "$SAGEMAKER_APP_TYPE_LOWERCASE" ]; then
# Sync pre-packaged skills to EBS
bash /etc/sagemaker/skills/sync_skills.sh || echo "Warning: skills sync failed, continuing..."
do errors get printed?

Is this log only available to the customer? Thinking about whether we should create metrics on this.
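
On the question of whether errors get printed: a small sketch, assuming the goal is for a failing step's stderr to land in the same stream as the warning while startup continues. The `run_optional_step` wrapper is hypothetical and not part of this PR.

```shell
# Hypothetical wrapper: run a startup step, merge its stderr into stdout so
# error text reaches the same log stream, and never abort the caller.
run_optional_step() {
    local label="$1"; shift
    local status=0
    "$@" 2>&1 || status=$?
    if [ "$status" -ne 0 ]; then
        echo "Warning: $label failed (exit $status), continuing..."
    fi
    return 0
}
```

For example, `run_optional_step "skills sync" bash /etc/sagemaker/skills/sync_skills.sh` would print the script's own error output followed by the warning line, and the exit code in the warning gives a hook for any metric added later.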

…rameterized agent targets, MCP tools

- Replace hardcoded git clone with skills-manifest.json config for scalability
- Extract agent config setup into sync_agent.sh
- Parameterize AGENT_SKILLS_DIRS array in sync_skills.sh for future agents
- Add read-only Jupyter MCP tools to allowedTools in subagent config
- Add 2>&1 to surface errors in startup logs
- Add mcp.json with jupyter_mcp_server definition
- Update agent config: allowedTools with @jupyter_mcp_server/ prefix,
  includeMcpJson, skills resource glob, system prompt
- Update sync_agent.sh to sync mcp.json to ~/.kiro/settings/
"name": "sagemaker_default",
"description": "SageMaker AI assistant with pre-installed skills for model customization, evaluation, deployment, and HyperPod operations.",
"tools": ["@builtin"],
"allowedTools": [
needs update based on what we have in jupyter MCP

(ignore this, already fixed)

…to script

- skills-manifest.json lives in dirs/ and only reaches /etc/sagemaker/skills/
  after COPY dirs/ + rsync. The jq command was reading it before it existed.
- Move git clone block to after rsync where the manifest is available.
- Extract inline clone logic into clone_skills.sh (set -eu for proper error
  propagation, consistent with sync_skills.sh/sync_agent.sh pattern).

# Setup skills and subagent for SMAI spaces
if [ -n "$SAGEMAKER_APP_TYPE_LOWERCASE" ]; then
# Setup skills and subagent for SMAI private spaces only
We can do skills and subagent in shared spaces, TBH. Let's keep this and just turn off chat in shared spaces.

OK for now though we can follow up on this

@TRNWWZ TRNWWZ merged commit fa86aca into aws:main Apr 22, 2026
1 check failed
4 participants