Microsoft’s GitHub Copilot pursues the absolute ‘time to value’ of AI in programming

How much generative AI can help coders is hotly debated. ZDNET’s own David Gewirtz has found from his first-hand experiments that OpenAI’s ChatGPT “can write pretty good code.” At the same time, some studies have found that large language models such as GPT-4 fall well below human coders in overall code quality.

But the debate over whether AI does or doesn’t stack up as a coder may be missing the point, some argue. The essence of coding help via automation, they say, lies in changing the nature of a programmer’s job.

“If you ask me what is the big change, what’s happened with the world of generative AI is that we have created another abstraction layer on top of AI,” said Inbal Shani, chief product officer for GitHub, the developer site owned by Microsoft, in a recent interview with ZDNET.

That abstraction layer, natural language, was initially used just for code completion. “That’s the basic layer that we’ve seen,” she said. The power of the abstraction layer, argues Shani, is that it can broaden out to many more uses of AI beyond code completion.

GitHub introduced its version of code assistance, GitHub Copilot, in June of 2021. This year has been “a transformational year” for AI in programming, said Shani. As Microsoft CEO Satya Nadella announced in October, GitHub has over a million paying customers using Copilot, and over 37,000 organizations using it.

Shani cited prominent Copilot users such as Accenture, which has put hundreds of developers on Copilot. “They’ve seen that there was a lot of usage to reduce what we call boilerplate code, the repetitive code that developers do not necessarily like to write, but have to because it’s part of their foundations.” 

Accenture has retained 88.5% of the code written by Copilot, said Shani. “So this means that Copilot was able to provide a high accuracy, high-fidelity answers to their developers that they choose to keep that code and not need to rewrite it.”

By one measure of productivity, the number of pull requests completed on time has increased by 15% as a result of using Copilot at Accenture. (A pull request is the step at which new code is merged with the main source for a project.) Moreover, “They’ve seen developers more apt to go through the build process,” the task of converting code into a running binary.
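The two figures cited above, the share of suggested code that is kept and the change in on-time pull requests, come down to simple ratios. The sketch below shows one way such numbers could be computed. It is only an illustration: the `PullRequest` record, the function names, and the reading of "increased by 15%" as a relative change in counts are assumptions, not a description of how GitHub or Accenture actually measured them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PullRequest:
    due: date     # hypothetical target date for the merge
    merged: date  # date the pull request was actually merged


def on_time_count(prs: list[PullRequest]) -> int:
    """Count pull requests merged on or before their target date."""
    return sum(1 for pr in prs if pr.merged <= pr.due)


def relative_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, as a fraction."""
    if before == 0:
        raise ValueError("baseline count must be non-zero")
    return (after - before) / before


def retention_rate(lines_suggested: int, lines_kept: int) -> float:
    """Fraction of suggested lines still present after review."""
    if lines_suggested == 0:
        raise ValueError("no suggested lines to measure")
    return lines_kept / lines_suggested
```

With made-up inputs, `relative_change(100, 115)` corresponds to a 15% increase, and `retention_rate(2000, 1770)` corresponds to 88.5% of suggested lines being kept. Neither ratio says anything about code quality, which is the caveat Shani raises later about measuring productivity.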

“Sometimes, developers hold themselves back” from doing builds, she noted. “They say, I don’t trust, I need to test again, but using Copilot, it kind of helped build that trust to deploy more code into production.”

Those little changes (more pull requests, more builds, less boilerplate code writing) bring immediate qualitative benefits to the developer’s day.

“If we can increase the build rate in a consistent way, then that basically helps developers to spend less time waiting for builds, to have more time back to focus on architecture and so on,” said Shani. 

“A shocking discovery that happened for me is that developers have less than two hours a day to write code,” on average, said Shani. “They need to do many things that are around the software development lifecycle, but not around the coding — they do builds, they write tests, they sit in meetings, they need to engage with other folks, they need to write PRs [pull requests].”

By automating some of those tasks, or parts of them, there’s the prospect “we’re giving more bandwidth for developers to invest in the other areas.”

None of this has yet been thoroughly and rigorously quantified in terms of a productivity increase, conceded Shani. “I think we’re in the middle of that,” she said of the process of measuring productivity. Copilot and its ilk “have not been adopted for long enough for us to get real, substantial data that we can say, here’s how we’ve changed lives forever.”

Productivity is tricky to define, she noted. “You can write really crappy code really fast,” so speeding up coding via code completion is “not necessarily an indicator of success.”

Rather, said Shani, “the work that we have ongoing is, What is really time to value? What is that impact? How do we measure the impact of these tools that we have been adopting along the way? That’s still ongoing.”

Another important element to measure somehow is “how to define developer happiness,” said Shani. “It’s very important for developers to be recognized, and right now, the recognition is coming in some companies from measuring how many lines of code am I writing.” But the verbosity of a programmer may not be the best indicator of how good a programmer is, she points out.

One of the more profound elements of the new abstraction layer taking shape in AI is a reduction in the need to switch between different tools. 

“Usually, if I’m looking for something I don’t know how to write, I’ll go to some sort of search engine,” explained Shani. “Copilot was able to bring all of that into the same environment.” The interface, the prompt, “is right there in your IDE [integrated development environment],” so that “you don’t need to go to different tools, you don’t need to copy-paste, you don’t need to do all that; you basically stay where you write your code.”

As a result, she said, “Developers are happy because they have less context-switching between tools.”

Copilot is finding its way into other areas of the programming team. One big Copilot user, e-commerce firm Shopify, is using Copilot to do coding interviews, to assess new hires, said Shani. It’s also using Copilot for onboarding of new programmers, as a “peer programmer” or educator to bring new coders up to speed.

Where Copilot and similar tools don’t yet produce the results one desires, much of the problem may be the learning curve of prompt engineering, said Shani. “You still need to know how to ask the right question,” she said.

“The more you ask a broader question [at the prompt], the more general the solution you’ll get that is not necessarily applicable for your situation,” whereas, “the more you know how to ask the right questions, the better you get an answer from Copilot.”
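To make the contrast concrete, here is a small illustration of a broad request next to a narrower one. Both prompts are invented for this example and do not come from the interview, and the function is a hand-written reference for what the narrower prompt asks for, not actual Copilot output.

```python
from datetime import date
from typing import Optional

# Broad prompt (hypothetical): leaves the input format, return type,
# and error handling for the tool to guess.
#   "Write a function that parses a date."

# Narrower prompt (hypothetical): states the input format, the return
# type, and what should happen on bad input.
#   "Write a function that parses an ISO 8601 date string such as
#    '2024-02-29' into a datetime.date and returns None if the string
#    is malformed."


def parse_iso_date(text: str) -> Optional[date]:
    """Hand-written reference for the narrower prompt above."""
    try:
        # date.fromisoformat raises ValueError on malformed or
        # impossible dates (for example, February 30).
        return date.fromisoformat(text.strip())
    except ValueError:
        return None
```

The narrower prompt pins down the decisions a reviewer would otherwise have to check by hand, which is the kind of question-asking skill Shani describes.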

Microsoft is working with customers such as Accenture on “that change management,” she said, of how to write a “proper prompt,” and “how to think about the question you ask Copilot to get the right answer that is applicable.”

There’s still a lot of fleshing out of Copilot itself that will likely have a major impact on its utility, and its accuracy. The program is gradually gaining the ability to become “personalized” for an individual developer. “An aspect we’re working on is how we can help these models to understand your coding style,” said Shani, “to understand which of these elements are critical for you as a software developer, to adjust the recommendations we give you.”

In February, GitHub will make an enterprise version of Copilot generally available. “This is specifically about more customized models for enterprises that want to have their own flavor of that implementation,” said Shani.

Within the enterprise edition, “you’re going to have the ability to summarize PRs or add comments to the code using Copilot, or search your documents and get that document you’re looking for.” There will also be increased emphasis placed on Copilot’s handling of testing and stress testing.

The over-arching idea is to “centralize everything with the same kind of AI flow model,” said Shani, “across software development, from inception to production.”

Advanced Micro Devices, the chip maker, is one of the beta customers for the enterprise edition, specifically for fine-tuning AMD’s internal generative AI models. “We have a long waiting list of more customers that want to enter,” she noted. “We’re taking it through a lot of rigorous testing, and we want to get a lot of feedback from customers that are currently on our beta program before we feel confident to share.”

It may sound strange to speak of developer happiness, given that some have suggested automating code via AI can eliminate programming jobs. That’s not the case, however, insists Shani. “It’s not going to replace developers, not in the next, I would say, five, ten years,” she said. “I’m in the camp that says never, because we’re just going to evolve as developers.”

Shani, who before coming to GitHub a year ago ran the Elastic Containers product at Amazon AWS, has been working with AI for over two decades. She recalls her own personal journey as a coder from Fortran to C++ to Java to Python. “At every point in time, everyone was freaking out: oh, my God, this is going to take away the work of developers.”

But, “We’ve seen more increase in developers because now we have lowered the barrier to be able to write more software.”

At the same time, the evolution of AI Copilots is “the same as the industrial revolution that led to factories that scaled food production to meet demand,” as Shani sees it. “That’s what’s happening now: there’s more demand for software, so there’s more demand for software developers.”

If code generation can be automated accurately, and if the abstraction layer can save on context switching, could Copilot and its ilk truly shorten the development time for projects? 

In the book The Mythical Man-Month, programmer Fred Brooks observed that simply adding resources to a large programming project not only failed to speed up the project but, a good deal of the time, actually made things worse.

It’s not yet clear if AI will dramatically help project scheduling and management or reduce the total effort required for a large programming project. 

“I don’t know if the concept of many months will turn to seconds,” said Shani. “Things will still take the right time to mature, but I think that the way to get there will be smoother and more efficient along the way if we can get to that value that we’re looking for in a shorter period of time.”
