Mid/Senior AI engineer with networking experience
B2B Contract 16 000 - 25 000 PLN + VAT
Get to know us better
CodiLime is a software and network engineering industry expert and the first-choice service partner for top global networking hardware providers, software providers and telecoms. We create proofs-of-concept, help our clients build new products, nurture existing ones and provide services in production environments. Our clients include both tech startups and big players in various industries and geographic locations (US, Japan, Israel, Europe).
While no longer a startup (we have 250+ people on board and have been operating since 2011), we’ve kept our people-oriented culture. Our values are simple:
Act to deliver.
Disrupt to grow.
Team up to win.
The project and the team
We're building software for modern platforms and operating systems, supporting leading networking equipment manufacturers, cloud-native solutions, and infrastructure projects. AI/ML is increasingly at the heart of these initiatives, not as an add-on but as a core tool for addressing complex engineering and networking challenges. We are seeking an engineer with a strong software engineering background, solid AI/ML expertise, and experience in computer networks.
Your role
As a part of the team, you will be responsible for:
Developing MCP-like tools that expose network device APIs and CLI commands with clear descriptions, structured inputs/outputs, validation logic, and error handling
Managing tool metadata and supporting semantic search over available tools using a vector database
Creating golden user queries, expected answers, and query variations for specific tools, intents, and network-operation scenarios
Building automated tests to verify correct tool selection, tool parameterization, output structure, and end-to-end agent responses
Designing evaluation workflows combining deterministic checks, human review, and LLM-as-a-judge techniques, for example using DeepEval or custom evaluation prompts
Refining prompts, tool descriptions, schemas, and agent workflows while monitoring regressions when new tools or changes are introduced
Developing production-quality Python code and tests using frameworks such as LangChain and LangGraph
Collaborating with software engineers, network domain experts, and DevOps teams to deliver reliable, testable, and maintainable agentic workflows
Do we have a match?
As a Mid/Senior AI engineer with networking experience, you must meet the following criteria:
AI and development expertise: Hands-on experience with LLM-driven workflows, agentic frameworks such as LangChain and LangGraph, and tool-calling patterns
Agentic tool development: Experience designing structured tools with clear descriptions, input/output schemas, validation logic, and integration with external APIs or command-based systems
Search, RAG, and prompting: Experience with semantic search, vector databases, RAG patterns, prompt engineering, and structured LLM outputs
Testing and evaluation: Experience creating golden queries, automated tests, regression checks, and chatbot/agent response evaluations, including LLM-as-a-judge approaches
Python engineering: Proven experience developing production-quality Python code, including automated tests and maintainable integration logic
Networking expertise: CCNA certificate or equivalent knowledge, including an understanding of networking platforms, device commands, and troubleshooting
English (B2 level at minimum, but preferably C1 or C2)
Beyond the criteria above, we would appreciate the following nice-to-haves:
AI-assisted coding: Experience with tools such as Codex, GitHub Copilot, Cursor, or similar
MCP and agent interoperability: Familiarity with Model Context Protocol, MCP server design, tool discovery, tool permissions/scopes, and emerging agent-to-agent communication patterns such as A2A
Advanced agent architectures: Understanding of routing agents, supervisor/planner patterns, multi-agent workflows, guardrails, and architectures combining deterministic logic with LLM-based reasoning
LLM evaluation tooling: Experience with frameworks and platforms such as DeepEval, LangSmith, OpenAI Evals, TruLens, BenchLLM, or similar tools for evaluating LLM and agent workflows
AI/ML for infrastructure data: Practical knowledge of classification, clustering, anomaly detection, time-series analysis, or statistical methods applied to telemetry, syslog, events, alerts, or operational data
Production deployment and operations: Experience deploying AI/LLM-based solutions in production environments, including Docker, Kubernetes, CI/CD, monitoring, MLOps, or cloud/hybrid infrastructure
Interactive analysis and visualization: Experience building dashboards, notebooks, or lightweight applications for analysis and validation using tools such as Jupyter, Streamlit, Plotly, Altair, matplotlib, or similar
More reasons to join us
Flexible working hours and approach to work: fully remote, in the office, or hybrid
Professional growth supported by internal training sessions and a training budget
Solid onboarding with a hands-on approach to give you an easy start
A great atmosphere among professionals who are passionate about their work
The ability to change the project you work on
Department: Quality Assurance Division
Role: Quality Assurance - billable
Locations: World, Poland
Remote status: Fully Remote