This week at Fortune Brainstorm Tech, I sat down with leaders actually responsible for implementing AI at scale - Deloitte, Blackstone, Amex, Nike, Salesforce, and more.

The headlines on AI adoption are usually surveys or arm-wavy anecdotes. The reality is far messier, far more technical, and - if you dig into the details - full of patterns worth stealing. A few that stood out:

(1) Problem > Platform
AI adoption stalls when it's framed as "we need more AI." It works when scoped to a bounded business problem with measurable P&L impact. Deloitte's CTO admitted their first wave fizzled until they reframed around ROI-tied use cases.
➡️ Anchor every AI proposal in the metric you'll move - not the model you'll use.

(2) Fix the Plumbing
Every failed rollout traced back to weak foundations. American Express launched a knowledge assistant that collapsed under messy data - forcing a rebuild of their data layer. Painful, but it created cover to invest in infrastructure that lacked a flashy ROI. Today, thousands of travel counselors across 19 markets use AI daily - possible only because of that reset.
➡️ Treat data foundations as first-class citizens. If you're still deferring middleware spend, AI will expose that gap brutally.

(3) Centralize Governance, Decentralize Application
Nike's journey is a case study:
Phase 1: centralized team → clean infra, no traction.
Phase 2: federated into business-line teams → every project tied to outcomes → traction unlocked.
The pattern is consistent: centralize standards, infra, and security; decentralize use-case development. If you only push from the top, you get a fast start but shallow impact. Only bottom-up ownership gives depth.
➡️ You can't scale AI from a lab. It has to live where the business pain lives.

(4) Humans Are Harder Than the Tech
Leaders agreed: the "AI story" is really a people story. Fear of job loss slows adoption.
➡️ Frame AI as augmentation, not replacement. Culture change is the real rollout plan.

(5) Board Buy-In: Blessing and Burden
Boards are terrified of being left behind. Upside: funding and prioritization. Downside: unrealistic timelines and a "go faster" drumbeat. The leaders who navigated this best used board energy to unlock investment in cross-functional data and security initiatives.
➡️ Harness board FOMO as cover to fund the unsexy essentials. Don't let it push you into AI theater.

(6) Success ≠ Moonshot, Failure ≠ Fatal
- Blackstone's biggest win: micro-apps that save investors 1-2 hours/day. Not glamorous, but high ROI.
- Nike's biggest miss: an immersive AI Olympic shoe designer - fun demo, no scale.
Incremental productivity gains compound. Moonshots inspire headlines but rarely deliver durable value.
➡️ Bank small wins. They build credibility and capacity for bigger bets.

In enterprise AI, the model is the easy part. The hard part - and the difference between a demo and value - is framing the right problem, building the data plumbing, designing the org, and bringing people along.
Challenges of AI Adoption
-
🧠 Most GenAI apps today still operate like toys: one prompt in, one black-box model, one fragile response out.

But building reliable, production-grade LLM systems requires a fundamental shift - not in model choice, but in how we engineer context. It's about applying real software engineering to AI workflows. Here's what that means in practice:

🔹 Context as the First-Class Citizen
The biggest bottleneck isn't the model. It's irrelevant, bloated, or missing context. Engineering the right context through smart retrieval, memory, and filtering matters more than prompt hacking.

🔹 Agents Need Structure, Not Magic
An agent isn't magic glue between a prompt and a tool. It's a software component with inputs, outputs, error states, logs, retries, and fallback logic. Treat it like one.

🔹 Roles > Prompts
Stop dumping every task on a single agent. Design specialized agents with clearly defined responsibilities - research, synthesis, decision-making, formatting - and let them collaborate. Don't overload one.

🔹 Observability Is Non-Negotiable
If you can't trace why a task failed - what prompt was used, what data was passed, what the tool returned - then your agent is a black hole, not a product.

🔹 Structured Outputs. Always.
Forget free-form text. Your agents should produce JSON, not poetry. Make outputs machine-parseable and testable; it's the only way to scale automation.

🔹 Fail Gracefully
LLMs will hallucinate. They will time out. They will crash APIs. That's not a bug; that's reality. Build guardrails, retries, and failover logic like any other brittle service (a minimal sketch follows below).

This isn't about AI hype. It's about engineering maturity. We're not prompt engineers anymore. We're context engineers, agent architects, and AI systems designers. And that's what will separate real products from fancy demos.

We just open-sourced our intra-platform multi-agent system. If you want to know more, just comment "Agents" and I will DM you the link. Also, here is a link to an Agenthon (agent hackathon): https://bit.ly/4eeuMA6

Image by Lance Martin / LangChain
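To make "structure, not magic" concrete, here is a minimal, framework-agnostic sketch of one agent step with structured output, tracing, retries, and a graceful fallback. `call_model` is a hypothetical stand-in for whatever LLM client you use; the required keys and backoff policy are illustrative assumptions, not a prescription.

```python
# Minimal sketch: one agent step with structured output, validation,
# retries, logging, and a fallback - no framework required.
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

REQUIRED_KEYS = {"answer", "confidence"}  # the JSON contract this step must satisfy

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM client call."""
    return json.dumps({"answer": "42", "confidence": 0.9})

def run_step(prompt: str, max_retries: int = 3) -> dict:
    for attempt in range(1, max_retries + 1):
        raw = call_model(prompt)
        # Trace everything: prompt, attempt, and raw response.
        log.info("attempt=%d prompt=%r raw=%r", attempt, prompt, raw)
        try:
            data = json.loads(raw)
            if REQUIRED_KEYS.issubset(data):
                return data  # machine-parseable, testable output
            log.warning("missing keys: %s", REQUIRED_KEYS - data.keys())
        except json.JSONDecodeError:
            log.warning("non-JSON output on attempt %d", attempt)
        time.sleep(2 ** attempt)  # back off before retrying
    # Fail gracefully with a well-typed fallback instead of crashing.
    return {"answer": None, "confidence": 0.0, "error": "fallback"}

print(run_step("Summarize the ticket and rate your confidence as JSON."))
```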
-
Signs You're Overcomplicating Your AI Solution

Here are some clear warning signs that your "agent" should probably be just a simple AI workflow instead:

⚠️ Your Task Flow Is Actually Static
- You find yourself hardcoding most of the "agent decisions"
- The same steps happen in the same order every time
➜ A simple prompt chain would accomplish the same thing (see the sketch below)

⚠️ No Real Tool Decisions
- Your "agent" is just calling the same tools in sequence
- Tool selection could be handled by basic if/then logic
➜ You're building complex reasoning for simple routing decisions

⚠️ Forced Complexity
- You're adding tools just to make it more "agent-like"
- Simple tasks are being broken into unnecessary sub-steps
➜ A single LLM call could handle what you've split into five tools

⚠️ Framework Overload
- You're spending more time learning agent frameworks than solving problems
- Simple integrations require mountains of boilerplate code
➜ You've added three dependencies to do one basic task

Remember: true agency makes sense when you need dynamic tool selection and reasoning about which step to do next. For everything else, stick to simple workflows. You'll get better results with less headache.
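For the first warning sign: if the same steps happen in the same order every time, the whole "agent" collapses into a plain prompt chain. A minimal sketch of what that looks like, with `call_model` as a hypothetical stand-in for a real LLM client and the three steps purely illustrative:

```python
# Minimal sketch: a fixed three-step prompt chain - no agent framework needed.
def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM client call."""
    return f"[model output for: {prompt[:40]}...]"

def handle_ticket(ticket_text: str) -> str:
    # The "agent decisions" were static all along: summarize, classify, draft.
    summary = call_model(f"Summarize this support ticket:\n{ticket_text}")
    category = call_model(f"Classify this summary as billing/bug/other:\n{summary}")
    reply = call_model(f"Draft a reply for a {category} ticket:\n{summary}")
    return reply

print(handle_ticket("My invoice was charged twice this month."))
```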
-
Many CFOs and FP&A teams have reached out about how to get practical at implementing and integrating #AI into #finance and FP&A. Not the theory. Not the hype. Just: Where do we start? What's real, useful, and relevant for our finance teams and our planning cycle?

Whenever I get that question, I usually start with the same thing: I ask a few simple but strategic questions about the business - how you forecast, where your data lives, what tools you use, where the pressure points are. That way, I can tailor advice that fits you, not just the latest trend.

But I know it's not always possible to hop on a call, dig deep, or bring in consultants. So I tried to recreate that process - in a way that scales.

I created: AI for FP&A Reflection Cards

A deck of 60 practical, no-fluff prompts to help you and your team:
- Understand where AI fits in your FP&A workflow
- Identify where you can save time or improve accuracy
- Build trust in AI-powered forecasts
- Ask better questions about data, assumptions, and insights
Basically, to bridge the gap between automation and strategy.

Here are the first 10 - tell me in the comments if they're useful and I'll post the rest next month.

These cards are designed to help you:
✅ Integrate AI into your actual forecasting, budgeting & reporting
✅ Get hands-on with your data without needing to be a data scientist
✅ Lead better conversations with stakeholders and execs
✅ Build confidence in using AI before you commit to big tools or projects
-
One in three companies is planning to invest at least $25M in AI this year, but only a quarter are seeing ROI so far. Why?

I recently sat down with Megan Poinski at Forbes to discuss Boston Consulting Group (BCG)'s AI Radar reporting, our findings, and my POV. Key takeaways below for those in a hurry. ;-)

1. Most companies have a data science team, a data engineering team, a center of excellence for automation, and an IT team; yet they're not unlocking the value, for three reasons:
a. For many execs, the technologies that exist today weren't around during their school years 20 years ago. As silly as it sounds, there was no iPhone, and certainly no AI deployed at scale at people's fingertips.
b. It's not in the DNA of a lot of teams to rethink processes around AI technologies, so the muscle has never really been built. This needs to be addressed, and fast.
c. A lot of companies have gotten used to 2-3% continuous improvement in efficiency and productivity on an annual basis. Now 20-50% is expected and required to drive big changes.

2. The 10-20-70 approach to AI deployment is crucial: building new and refining existing algorithms is 10% of the effort; 20% is making sure the right data is in the right place at the right time and that the underlying infrastructure is right; and 70% of the effort goes into rethinking and then changing the workflows.

3. The most successful companies approach AI and tech with a clear focus. Instead of getting stuck on finer details, they zero in on friction points and how to create an edge. They prioritize fewer, higher-impact use cases, treating them as long-term workflow transformations rather than short-term pilots. Concentrating on core business processes is where the most value lies: moving quickly to redesign workflows end-to-end and aligning incentives to drive real change.

4. The biggest barrier to AI adoption isn't incompetence; it's organizational silos and no clear mandate to drive change and own outcomes. Too often, data science teams build AI tools in isolation, without the influence to make an impact. When the tools reach the front lines, they go unused because business incentives haven't changed. Successful companies break this cycle by embedding business leaders, data scientists, and tech teams in cross-functional squads with the authority to rethink workflows and incentives. They create regular forums for collaboration, make progress visible to leadership, and ensure AI adoption is actively managed, not just expected to happen.
-
CFO Mistake number 1: not using AI.

Read further to discover the roadmap I have given to over 1,000 CFOs (the roadmap below is based on many minutes of transcripts, worth thousands of dollars of consultation):

1) Foundation First - "Crawl"
Why: Before anything clever, make AI safe, accessible, and pointed at work data.
How:
- Pick the stack your company already trusts (Microsoft → Copilot 365; Google → Gemini)
- Issue professional licenses (not just to managers) and turn on meeting transcripts
- If you're not on Microsoft, ChatGPT Teams gives SOC 2-type protections and connectors (Dropbox/Outlook/etc.)

2) Quick Wins that Stick - "Walk"
Why: Early, visible wins build trust and momentum.
How: Start with 2-3 wins any team can repeat:
- Auto-summarize meetings and extract actions from transcripts
- Draft the monthly review from call notes

3) Teach the Team, Not the Tool - "Organize"
Why: Licenses don't equal adoption; cadence does.
How:
- Launch a Teams/Slack channel for daily use cases + Q&A
- Run bi-weekly "lunch & learn" show-and-tells
- After 6-8 weeks, hold a 2-hour hackathon focused on building agents/GPTs for real finance tasks

4) Projects vs. GPTs - "Systematize"
Why: Consistency beats clever prompts.
How:
- Use Projects to hold client/company-specific files & context
- Use custom GPTs/agents for repeatable tasks across clients (cash/flux analysis, email templates), then call the GPT inside each project for consistent behavior

5) Data Foundations, Light - "Un-silo"
Why: Most pain is messy, disconnected data; you don't need a new ERP to start.
How:
- Stand up a light harmonization layer: use Zapier (fast) or n8n (flexible/self-hosted) to sync keys and push clean events between CRM, support, and accounting
- Apply fuzzy matching on org names/addresses and standardize a master table (see the sketch after this post)

6) Automate the Repetitive - "Run"
Why: Free up 20-40% of team time with safe, deterministic workflows.
How:
- Use Power Automate/Office Scripts (Microsoft) to script repeatable flows: save attachments to SharePoint, transform Excel → PowerPoint, label emails, schedule reports
- Let AI write the scripts; humans review once

7) Build Finance Agents - "Augment"
Why: Agents replace a human triggering the AI with automatic triggering: "do the thing when X happens."
How:
- Start with agents for commentary drafting, classification, vendor form filling, or monthly flux notes
- Share agents internally so anyone can reuse and adapt
- Tools: n8n, Copilot Studio, or Zapier

I have many other frameworks to share based on my transcripts. Let me know in the comments if you want me to continue sharing them!
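A minimal sketch of step 5's fuzzy matching on org names, using only the Python standard library. The suffix list and the 0.85 threshold are illustrative assumptions you would tune against your own data:

```python
# Minimal sketch: fuzzy-match organization names from one system against a
# master table, standard library only (difflib).
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    # Strip punctuation and legal-suffix noise so "ACME Corp." matches "Acme Corporation".
    name = name.lower().replace(".", "").replace(",", "")
    for suffix in (" incorporated", " corporation", " corp", " inc", " ltd", " llc"):
        name = name.removesuffix(suffix)
    return name.strip()

def best_match(name: str, master: list[str], threshold: float = 0.85):
    """Return the closest master record, or None if nothing clears the threshold."""
    scored = [(SequenceMatcher(None, normalize(name), normalize(m)).ratio(), m)
              for m in master]
    score, match = max(scored)
    return match if score >= threshold else None

master_table = ["Acme Corporation", "Globex Inc", "Initech LLC"]
for crm_name in ["ACME Corp.", "Globex, Incorporated", "Umbrella Ltd"]:
    print(crm_name, "->", best_match(crm_name, master_table))
```

Unmatched names (like "Umbrella Ltd" above) go to a human review queue rather than being force-merged; that keeps the master table trustworthy.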
-
We Are Getting AI-RAN Wrong

The story usually goes like this: take the hottest technology in the world, GPUs, and bolt them next to radios. Voilà: AI-RAN.

But the RAN is not a data center. It's a hard real-time control system where every NR slot - 0.25 to 1 ms depending on numerology - must be closed on time. Miss the window and you don't degrade gracefully; you lose the transmission. That's why today's distributed units are engineered like Swiss watches: rugged, deterministic, and built to last a decade in the field. A modern baseband such as Ericsson's 6648 is a shoebox-sized 1U appliance, drawing 310 W at typical load (peaking at 340 W) and delivering millisecond-scale timing in outdoor cabinets. Pricing is usually in the low tens of thousands, with air cooling and tight heat and power budgets carefully managed.

Now set that against NVIDIA's H100: 350-400 W in PCIe, up to 700 W in SXM, priced at $30-40k, and built for climate-controlled halls with advanced cooling. Putting one at a macro site is like dropping a Ferrari engine onto a bicycle frame. Technically impressive, contextually absurd.

The economics bend the same way. The global RAN market was about $35B in 2024, dominated by five vendors who live on scale and efficiency. NVIDIA's FY2025 revenue was $130.5B, and a single hyperscaler GPU order can be $10B or more. To NVIDIA, telco is small change. To operators, obsessed with squeezing every watt and every dollar, a $40k, 400 W box per site is not a business case; it's a non-starter.

But the real opportunity is not about bolting GPUs into DUs. It's about system redesign:

1. Silicon. Intelligence must be embedded, not attached: neural engines inside baseband SoCs, running PHY helpers like channel estimation or decoding with fixed latency at telecom-grade power.
2. Models. Train large in the cloud, then deploy distilled, quantized, sparsity-aware models that adapt compute to channel conditions and fall back to DSP when joules-per-bit demand it.
3. Placement. Match the model to the control loop: tiny nets alongside the PHY in the DU; predictive schedulers in the CU or near-RT RIC (10-100 ms); heavy analytics and policy in the non-RT RIC (seconds to minutes). (A toy placement sketch follows below.)
4. Orchestration. The SMO/OSS must act as an air-traffic controller: power- and KPI-aware, shifting inference across DU, CU, and MEC based on load, grid price, and thermal headroom.

Success is not demo throughput; it's BLER reduction per watt, or kWh saved per sector. GPUs will remain superb for training and for selective edge inference. But the RAN that scales is one where AI is designed into the fabric: AI-aware basebands, compact models, loop-correct placement, and power-sensible orchestration.

That's not a Ferrari strapped to a bicycle. That's a new bicycle built with intelligence in every spoke - cadence held, balance kept.

https://lnkd.in/gV4kJCuh
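To make point 3 concrete, here is a toy sketch of loop-correct placement: route each model to the tightest control loop whose latency budget its worst-case inference time fits. The tier budgets and model latencies are illustrative assumptions, not vendor figures:

```python
# Toy sketch of "loop-correct placement": assign each model to the tightest
# control loop whose deadline it can meet. All numbers are illustrative.

# Deadline budgets per tier, in milliseconds, ordered tightest first.
TIERS = [
    ("DU / PHY helper", 1.0),      # per-slot, hard real-time
    ("near-RT RIC", 100.0),        # 10-100 ms control loops
    ("non-RT RIC", 60_000.0),      # seconds to minutes
]

def place(p99_latency_ms: float) -> str:
    """Return the tightest tier whose deadline the model's p99 latency fits."""
    for tier, budget_ms in TIERS:
        if p99_latency_ms <= budget_ms:
            return tier
    return "cloud (training / offline analytics)"

for name, p99 in [("channel-estimation tiny net", 0.2),
                  ("predictive scheduler", 25.0),
                  ("heavy analytics / policy", 4_000.0)]:
    print(f"{name}: {place(p99)}")
```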
-
Balancing CPU and GPU Architectures for Network Edge AI in AI-RAN

As AI workloads migrate from centralized cloud data centers to the network edge, selecting the right compute architecture for inference becomes a strategic decision.

🎓 CPUs and GPUs play complementary roles at the edge, especially for SLM-based inference, multi-agent systems, and AI-RAN use cases like mMIMO beamforming.

✔️ Edge inference workloads are diverse, which means no single hardware architecture fits all scenarios.

1️⃣ CPU-Based Inference at the Edge
✅ Strengths
- Good for lightweight SLMs: CPUs can run lightweight AI/ML models without the idle overhead common with GPUs.
- Power efficient: Modern CPUs optimized for inference consume less power, making them ideal for telecom base stations, smart gateways, and regional MEC nodes.
- Scalable microservices deployment: Well suited to stateless, containerized AI workloads such as retrieval-augmented generation (RAG) and vector search.
🚫 Limitations
- Not suited to heavy matrix ops: For AI workloads that require large matrix multiplications (e.g., vision models, large transformers), CPUs lag.
- Limited acceleration for mMIMO: CPUs lack the throughput and parallelism required for real-time beamforming and large-scale signal correlation.

2️⃣ GPU-Based Inference at the Edge
✅ Strengths
- High throughput for parallel workloads: Excellent at handling transformer layers, image-based inference, and multi-modal inputs.
- Essential for massive MIMO in AI-RAN: mMIMO processing involves real-time matrix decomposition, beamforming weight updates, and channel state estimation - tasks that benefit greatly from GPU acceleration. GPUs can efficiently execute compute-heavy AI/ML algorithms and AI-driven models for dynamic beamforming optimization. In 64T64R or higher mMIMO systems, fronthaul signal processing can exceed 20-40 Gbps - an area where GPUs shine with their parallelism and memory bandwidth.
🚫 Limitations
- Power & cost overhead: Unsuitable for certain edge cabinets or small-cell sites with tight thermal envelopes.
- Overkill for lightweight tasks: Lightweight SLMs running on GPUs can lead to inefficient resource use and increased cost per inference.

3️⃣ Hybrid Strategies in AI-RAN and Edge AI
To support both agentic AI models and real-time RAN signal processing, hybrid edge platforms are emerging:
- SLMs and control agents run on AI-optimized CPUs.
- LLMs, beamforming, and AI-enhanced PHY processing are handled by edge GPUs.
- An AI orchestrator layer dynamically assigns workloads based on latency, compute availability, and model type (a minimal dispatch sketch follows below).

Hybrid computing platforms such as Grace-Blackwell allow operators to meet both latency SLAs and efficiency goals while delivering advanced AI-RAN features like self-optimizing networks, intelligent mobility management, and edge AI/LLM inferencing.

#AIatEdge #AIInference #SLM #AgenticAI #AIforRAN #MassiveMIMO #EdgeComputing #GPUs #CPUs #HybridAI #5G
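A minimal sketch of the hybrid dispatch idea from section 3️⃣: route each inference job to a CPU or GPU pool based on workload type and latency budget. The categories and thresholds are illustrative assumptions, not drawn from any specific platform:

```python
# Minimal sketch: dispatch edge inference jobs to CPU or GPU pools by
# workload type and latency budget. All categories/numbers are illustrative.
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    kind: str           # "slm", "llm", "mmimo_phy", "rag"
    latency_ms: float   # target latency budget

GPU_KINDS = {"llm", "mmimo_phy"}  # heavy matrix ops and parallel signal processing

def dispatch(job: Job) -> str:
    # GPUs for parallel-heavy workloads or sub-5 ms real-time deadlines;
    # lightweight SLM/RAG microservices stay on CPUs to avoid idle-GPU cost.
    if job.kind in GPU_KINDS or job.latency_ms < 5.0:
        return "gpu-pool"
    return "cpu-pool"

for job in [Job("beamforming weights", "mmimo_phy", 1.0),
            Job("control agent", "slm", 50.0),
            Job("vector search", "rag", 80.0),
            Job("edge LLM chat", "llm", 500.0)]:
    print(f"{job.name} -> {dispatch(job)}")
```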
-
AI content isn't building assets. It's renting traffic.

This year, I've worked with four organisations facing the same crisis. They invested £15-50k in automated content generation, expecting to build a sustainable traffic engine. Instead, they've discovered something unsettling: AI content behaves more like PPC than owned media - stop investing and the traffic stops.

This is typical of what I'm seeing with AI content:
→ Decay rates of 15-30% per quarter as Google refines its understanding of thin AI content
→ 70% of articles generating zero traffic (not the 20-30% typical of traditional content)
→ No compounding effects - unlike human-written evergreen content, AI articles don't build authority or attract natural links
→ Forced reinvestment - clients must keep producing just to maintain current traffic levels

One client's data is particularly telling:
• 485 blogs and landing pages published
• 410 pages (84.5%) get zero traffic
• £5,700/month traffic value from just 10 pages
• Massive drops in late 2024 and 2025, suggesting algorithmic vulnerability

The power law is working (10 pages drive most of the value), but it's unstable. Those winners can disappear overnight with algorithm updates.

My recommendations: instead of "publish 6,000 AI articles," my clients are now:
• Using AI to discover 400 topics through article testing (rather than producing content for the sake of content)
• Identifying the 10-20 winners from real traffic data
• Investing £500-1,000 per winner to add defensibility (expertise, data, tools, links & FAQs)
• Archiving or consolidating the losing content

AI excels at discovery and testing. But building moats still requires human expertise, original thinking, and genuine value.

The question to ask: are you building equity, or renting traffic that expires?

What are you seeing in your content analytics? Are your 2024-era AI articles still performing, or showing decay?
-
LLMs have demonstrated exceptional performance across a wide range of tasks. However, their significant computational and memory requirements present challenges for efficient deployment and lead to increased energy consumption. It is estimated that training GPT-3 required 1,287 MWh, equivalent to the average annual energy consumption of 420 people!

Recent research has focused on enhancing LLM inference efficiency through various techniques. To make an LLM efficient, there are three broad approaches:

𝟭. 𝗗𝗮𝘁𝗮-𝗟𝗲𝘃𝗲𝗹 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻𝘀 focus on optimizing input prompts and output content to reduce computational costs without modifying the model itself. Techniques like input compression and output organization can be used to achieve this. Input compression involves strategies such as prompt pruning and soft prompt-based compression, which shorten prompts and thus reduce memory and computational overhead. Output organization methods such as Skeleton-of-Thought (SoT), which first generates a short answer skeleton and then expands the points in parallel, enable batch inference, improving hardware utilization and reducing overall generation latency. These approaches are cost-effective and relatively easy to implement.

𝟮. 𝗠𝗼𝗱𝗲𝗹-𝗟𝗲𝘃𝗲𝗹 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻𝘀 involve designing efficient model structures or compressing pre-trained models to enhance inference efficiency. This can be achieved through techniques such as efficient feed-forward network (FFN) design, where approaches like Mixture-of-Experts (MoE) reduce computational costs while maintaining performance. These optimizations can be impactful in high-demand environments where maximizing performance while minimizing resource usage is critical, though they may require more significant changes to the model architecture and training process.

𝟯. 𝗦𝘆𝘀𝘁𝗲𝗺-𝗟𝗲𝘃𝗲𝗹 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻𝘀 enhance efficiency by optimizing the inference engine or serving system without altering the model itself. Techniques like speculative decoding and offloading in the inference engine can improve latency and throughput by optimizing computational processes (a toy sketch of speculative decoding follows below). Furthermore, serving-system strategies such as advanced scheduling, batching, and memory management ensure efficient resource utilization, reducing latency and increasing throughput. These optimizations are particularly useful for large-scale deployments where the model serves many users simultaneously, and they can be implemented at relatively low cost compared to developing new models, making them a practical choice for improving the efficiency and scalability of existing AI systems.

As these optimization techniques continue to evolve, they promise to further enhance the efficiency and scalability of LLMs, paving the way for even more advanced AI applications. What other innovative approaches can we expect to see in the quest for optimal AI performance?
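As a flavor of the system-level techniques, here is a heavily simplified toy sketch of speculative decoding: a cheap draft model proposes several tokens and the expensive target model only keeps the ones it agrees with. Both models are stubs here, and real implementations verify all draft tokens in a single batched forward pass with probability-based acceptance rather than exact matching:

```python
# Toy sketch of speculative decoding with stub models.
import random

random.seed(0)
VOCAB = list("abcde")

def draft_model(prefix: list[str], k: int) -> list[str]:
    """Cheap model: quickly propose k candidate tokens (stub: random)."""
    return [random.choice(VOCAB) for _ in range(k)]

def target_model(prefix: list[str]) -> str:
    """Expensive model: the token it would produce next (stub: deterministic)."""
    return VOCAB[len(prefix) % len(VOCAB)]

def speculative_decode(rounds: int = 4, k: int = 4) -> list[str]:
    out: list[str] = []
    for _ in range(rounds):
        proposal = draft_model(out, k)
        for tok in proposal:
            if tok == target_model(out):
                out.append(tok)               # accepted: draft matched the target
            else:
                out.append(target_model(out)) # rejected: take the target's token
                break                         # and restart drafting from here
    return out

print("".join(speculative_decode()))
```

The speedup comes from the target model validating a whole block of draft tokens per forward pass instead of generating one token at a time.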