Sagar Sharma | Tech Journalist at AIM | https://analyticsindiamag.com/author/sagar-sharma/

Why AI Can’t Get Software Testing Right | https://analyticsindiamag.com/developers-corner/why-cant-ai-tools-get-programming-tests-right/ | Tue, 03 Sep 2024

It’s already a danger when you write the implementation first; AI is only going to make it worse.


Writing unit tests was already a headache for developers, and AI is making it worse. A recent study has unveiled a critical weakness in LLMs: their inability to create accurate unit tests. 

While ChatGPT and Copilot demonstrated impressive capabilities in generating correct code for simple algorithms (success rates ranging from 63% to 89%), their performance dropped significantly when tasked with producing unit tests, which are used to evaluate production code.

ChatGPT’s test correctness fell to a mere 38% for Java and 29% for Python, with Copilot showing only slightly better results at 50% and 39%, respectively.

According to a study published by GitLab in 2023, automated test generation is one of the top use cases for AI in software development, with 41% of respondents currently using it. However, this recent study is now questioning the quality of those tests. 

A full-stack developer named Randy on the Daily.dev forum mentioned that he had tried AI for both writing code and writing unit tests, and it failed miserably because it does not understand tools like Groovy and the Spock testing framework.

Why AI is Poor at Software Testing

AI-generated tests often lack the necessary context and understanding of the specific requirements and nuances of a given codebase. As a result, AI can lead to an increase in “tautological testing” – tests that prove the code does what the code does, rather than proving it does what it is supposed to do.

“It’s already a danger when you write the implementation first; AI is only going to make it worse,” a user explained in the Reddit discussion.

Moreover, relying on AI for test writing can lead to a false sense of security, as generated tests may not cover all critical scenarios, potentially compromising the software quality and reliability.

When an AI is asked to write unit tests for code that contains a bug, it typically doesn’t have the ability to identify that bug. Instead, it treats the existing code as the “correct” implementation and writes tests that validate the current behavior – including the bugs, if any.
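
To make that concrete, here is a hypothetical Python example (both the function and the test are made up):

# Buggy implementation: the discount is meant to apply to orders of 100 or more,
# but the comparison accidentally excludes exactly 100.
def apply_discount(total):
    if total > 100:  # bug: should be >= 100
        return total - total // 10  # 10% off
    return total

# A tautological, AI-style test derived from the implementation: it asserts the
# current (buggy) behaviour, so the bug now ships with a passing test guarding it.
def test_apply_discount():
    assert apply_discount(200) == 180
    assert apply_discount(100) == 100  # enshrines the bug at the boundary

A reviewer reading the requirement (“orders of 100 or more get 10% off”) would flag the second assertion; a test generated from the code alone will not.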

Instead, one developer suggests that a better use for AI would be to ask it, “What are all the ways that this code can fail?” Rather than having it write tests, have it identify scenarios you might have missed.

Another report, by researchers from the University of Houston, suggested similar numbers for ChatGPT-3.5: only 22.3% of the generated tests were fully correct, and 62.3% were somewhat correct.

The report also noted that LLMs struggle to understand and write OpenMP and MPI unit tests due to the inherent complexity and domain-specific nature of parallel programming. Moreover, when provided with “too much” context, LLMs tended to hallucinate, generating code with nonexistent types, methods, and other constructs.

“Like other LLM-based tools, the generated tests are a “best guess” and developers shouldn’t blindly trust them. In many cases, additional debugging and editing are required,” said Ruiguo Yang, the founder of TestScribe. 

AI also has a hard time coming up with genuinely new test cases. Human testers, with their creative problem-solving skills, are still needed to draw up thorough test plans and define the overall testing scope.

But What is the Solution?

To solve this problem, researchers from the University of Houston used the LangChain memory method. They passed along smaller pieces of the code as a guide, allowing the system to fill in the rest, similar to how autocomplete works when you’re typing.

This suggests that one of the most effective ways to tackle the problem is to provide more context to the model, such as the full code or associated libraries, which significantly improves the compilation success rate. For instance, with ChatGPT, the rate increased from 23.1% to 61.3%, and for Davinci, it reached almost 80%.
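
The following Python sketch illustrates the general pattern (the generate function is a placeholder for any LLM call; the study itself used LangChain’s memory utilities and a more careful chunking strategy):

def chunk(source, max_lines=40):
    """Split source code into small, line-based chunks."""
    lines = source.splitlines()
    return ["\n".join(lines[i:i + max_lines]) for i in range(0, len(lines), max_lines)]

def generate_tests(source, generate):
    """Feed the code to the model piece by piece, carrying output along as memory.

    generate(prompt) -> str is a placeholder for an LLM call.
    """
    tests = ""
    for piece in chunk(source):
        prompt = (
            "You are writing unit tests incrementally.\n"
            f"Tests so far:\n{tests}\n"
            f"Next part of the code under test:\n{piece}\n"
            "Extend the test suite to cover this part."
        )
        tests = generate(prompt)  # the output becomes context for the next chunk
    return tests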

In recent times, tools like Cursor have been helping developers build code without much hassle, and in the future, we might see such tools producing better unit tests along with production code.

But for now, while AI can generate tests quickly, having an experienced engineer will remain crucial to assess the quality and usability of AI-generated code or tests.

Python Can’t Get Package Management Right | https://analyticsindiamag.com/developers-corner/python-cant-get-package-management-right/ | Mon, 02 Sep 2024

pip, Python’s default package manager, cannot handle two different versions of the same package in one environment, which leads to “dependency hell”, causing entire installations to fail.


The struggle is real. When a developer uses multiple package managers, there’s a risk of modules being overwritten or conflicting. Nearly 4.71% of packages on PyPI have module conflicts in their dependency graphs, leading to broken environments or even security risks.

Often, different package managers use different lock file formats, which can cause issues when switching between tools or collaborating with others using different package managers. In Python, things can get much worse when you consider dependency management.  

In the Python world, there are multiple package managers, including pip, Conda, Poetry and pipenv (plus version managers like pyenv), each of which seems to have its own flaws.

Why does it matter? This fragmentation confuses new and experienced developers alike, and eventually everything feels unreasonably slow. Most users try to solve it by replicating someone else’s environment without giving it a second thought, and that does not work either.

But Python is Tone-Deaf on Dependency Resolution

One of the primary issues in Python dependency management is handling conflicting dependencies.

For instance, pip, the default package manager, cannot install two different versions of the same package side by side. This situation, often dubbed “dependency hell”, causes entire installations to fail, leading to unexpected behaviour in projects.
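
A hypothetical sketch of how this plays out (tool-a, tool-b and libcore are made-up package names):

# requirements.txt
tool-a    # depends on libcore==1.0
tool-b    # depends on libcore==2.0

# 'pip install -r requirements.txt' has to pick exactly one version of libcore
# for the environment, so the resolver reports a conflict and the install fails.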

A few months ago, a Reddit user remarked that Python feels like a programming environment from the early 80s, where a developer had a single project on their PC, and that was all they worked on for years.

“Python wants to do everything on the global system level, including runtime versioning and packages. That means that any two developers can think they have a working project on their system, even though they have radically different setups. This makes handing off and deploying Python applications a nightmare,” he added further, suggesting why dependency resolution is a nightmare on Python. 

However, the strangest part of dependency resolution is that pip makes assumptions. The pip documentation explains that pip makes assumptions about package versions during installation and only later checks those assumptions, which can lead to conflicts if they turn out to be incorrect.

Managing dependencies can also be resource-heavy. One user reported more than 100 GB of their hard drive being filled with Python virtual environments, highlighting the storage impact of maintaining multiple environments.

Ergo, Virtual Environments

“I’m afraid of having 2000 folders, each one with a different virtual environment,” said one Reddit user, expressing confusion about virtual environments. Even running a single project in isolation becomes cumbersome.

While virtual environments are essential for project isolation and dependency management, there are instances where users find that they create problems rather than solve them.
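
For reference, the standard workflow itself is small. A sketch on macOS/Linux (the installed package is just an example):

python -m venv .venv              # create an isolated environment in ./.venv
source .venv/bin/activate         # activate it for this shell session
pip install requests              # installs into .venv, not the global site-packages
deactivate                        # leave the environment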

Users have previously reported that package versions and dependencies can still conflict within virtual environments, requiring manual resolution in some cases, which directly calls Python’s promise of isolation into question.

Some developers view virtual environments as wasteful, believing they unnecessarily duplicate libraries for each project. As one Reddit user stated, “It seems like you’re installing a new copy of every library every time you start a new project, which seems like a waste of resources.”

The complexity of virtual environments can be overwhelming for those new to Python. A Reddit user expressed extreme frustration, saying, “I spend way more time just trying to get my virtual environment up, project dependencies installed, and IDE configured than I do actually coding.”

Several developers recommend using Docker to avoid virtual environment issues altogether. This approach encapsulates the entire environment, making it more reproducible across different systems.

Meet Bython, the Python With Braces | https://analyticsindiamag.com/developers-corner/meet-bython-the-python-with-braces/ | Fri, 30 Aug 2024

Bython is Python with braces because Python is awesome, but whitespace is awful.


Python has long been celebrated for its simplicity and readability. However, for developers coming from languages like C++ or Java, its whitespace-based syntax can sometimes feel unfamiliar or challenging. This problem served as the inspiration for Bython, jokingly touted as the Python with braces.

Bython is a Python preprocessor which aims to provide an alternative syntax for Python that uses curly braces to define code blocks, similar to languages like C++ or Java, while still leveraging Python’s interpreter and ecosystem.

But it does not end there. The main appeal of Bython is that you don’t have to worry about indentation: it won’t throw errors even if you mess up tabs and spaces, or copy a piece of code into a file that uses a different indentation style.
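
For a taste, here is a small sketch based on the brace syntax shown in the project’s README; the bython preprocessor translates it into regular, indented Python before handing it to the interpreter:

def print_message(num_of_times) {
    for i in range(num_of_times) {
        print("Bython is awesome!");
    }
}

print_message(3)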

JavaScript for Python?

Python gets a lot of hate for its whitespace rules: when you accidentally mix tabs and spaces, you get an error that is hard to track down. A Reddit user mentioned that he loves brackets because bracketed code does not break when copied. “I spend more time tabbing than I do putting two brackets and auto formatting,” he added.

The key thing to note here is that Bython uses Python for interpretation, meaning existing Python modules like NumPy and Matplotlib still work seamlessly.

Jagrit Gumber, a frontend developer on X, praised Bython, mentioning that it is really worth it if you come from a C, C++, C# or Java background.

Some developers prefer using braces in code because it allows them to freely cut, paste, and rearrange code sections, relying on automatic formatting tools to correctly align and indent the code afterwards, regardless of how it was initially typed.

It is also related to the muscle memory of programmers coming from other programming languages. When you spend enough time with one language, you develop a habit of using curly braces, and most programming languages do use them. A developer on Hacker News mentioned that he had developed a muscle memory of using braces by pressing Ctrl + M, and that he was able to parse code much faster when it was both indented and properly bracketed.

Bython also allows developers to write code that is both easy to read and write, like Python, while also being able to leverage the efficiency and speed of C. “Bython provides easy ways to call C functions and leverage C libraries, more control over hardware, and performance optimisations,” said Saikat Sinha Ray while explaining how Bython can be used to call C functions. 

Some users loved Bython so much that they wanted its features integrated into Python itself. A user on Hacker News said, “This shouldn’t be ‘Bython’, it should be Python 4.0,” suggesting the next iteration of Python should have Bython baked into it.

Will Bython Replace Python?

While Bython solves some key issues that bother multiple users, it cannot replace Python by any means. Bython is more of a passion project from Mathias Lohne, and it is not meant to be merged into the Python project, as not everyone finds whitespace an issue.

There are users who want to use Python and have no issues with its syntax, and even users who hate the idea of braces and find Python’s syntax better as it is. So think of Bython as an optional extension of Python, to be used if you take issue with whitespace-based syntax.

How is Linux Powering the AI Moment? | https://analyticsindiamag.com/developers-corner/how-is-linux-powering-the-ai-moment/ | Thu, 29 Aug 2024

NVIDIA has been using Ubuntu exclusively to demonstrate deep learning on all its edge solutions, which suggests Linux performs better for deep learning tasks.


A few months ago, NVIDIA open-sourced its GPU drivers, starting with the R515 release for Linux. This solves one big problem: getting a Linux system powered by an NVIDIA card up and running. But with users having requested this for years, what finally made NVIDIA open source them?

The answer is simple: most developers use Linux, and drivers are the most stressful part of using an NVIDIA GPU.

Not just developers; many big tech companies, including OpenAI and Google, rely on Linux to build their AI platforms.

TensorFlow, one of the most popular AI development libraries, by Google, works best on Ubuntu, a well-known Linux distribution. On Windows, however, you’ll need WSL (Windows Subsystem for Linux), which essentially acts as a virtual machine for running Linux.

Even Google DeepMind, one of the leading AI research labs, previously used a modified version of Ubuntu, called Goobuntu, but later switched to Debian testing. These are both Linux-based operating systems. 

Further, IBM’s Watson, known for its natural language processing and machine learning capabilities, runs on SUSE Linux Enterprise Server. 

CUDA Performs Better on Linux

It might come as a surprise, but many users have reported that CUDA performs better on Linux than Windows. 

Mike Mikowski, the manager of the Kubuntu Focus project, mentioned that NVIDIA GPU drivers are almost never faster on Windows, especially for deep learning workloads that use the CUDA and OpenCL interfaces.

NVIDIA has been using Ubuntu exclusively to demonstrate deep learning on all their edge solutions, which suggests Linux performs better for deep learning tasks compared to Windows. 

Meanwhile, a Reddit user reported that when he ran CUDA on the same hardware under different operating systems, including Windows and Ubuntu, the latter performed better.

A reason behind this performance gain is that Linux has better GPU command scheduling than Windows.

Linux Makes AI Dev Environment Setup a Breeze

Despite Linux offering a more limited choice of desktop software, a majority of AI developers use it. The key reason is how readily available the relevant software libraries are and how well they work on Linux.

For example, getting CUDA and cuDNN up and running on Windows is a hassle rather than a seamless process. On Linux, by contrast, everything can be managed via the package manager without any issues.
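
Once drivers and the toolkit are installed, a quick sanity check from Python (assuming a PyTorch build with CUDA support) confirms the GPU is usable:

import torch

print(torch.cuda.is_available())          # True if driver, CUDA and PyTorch line up
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the first visible GPU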

This is one of the reasons why most software development libraries and tools make their way to Linux first. 

A Reddit user, while answering why everyone uses Linux, mentioned that installing and setting up development software on Linux is a breeze thanks to a healthy array of package managers. 

“On Ubuntu or Debian-based systems, apt has most of what one will need to set a machine up. That, coupled with Conda, pip, and perhaps some additional software… it’s fast to install new libraries and programs, and dependencies are handled automatically,” he added further. 

Linux, being a versatile operating system, matches the target production system in most cases, as most production workloads run on Linux. A user on Stack Overflow mentioned that CUDA performs better on Linux, stating, “If you are looking at a performance point of view, and the time taken for a build, it would be best to use Linux.”

When you combine better CUDA performance with the ease of configuring a development environment, plus other benefits like security and a large, supportive community, there’s no doubt that Linux is powering the AI moment.

Why are Devs Turning to TypeScript for AI Development? | https://analyticsindiamag.com/developers-corner/why-are-devs-turning-to-typescript-for-ai-development/ | Tue, 27 Aug 2024

TypeScript's static typing helps ensure code quality and reduces the likelihood of bugs slipping through to production.


Web application developers who want to use LLMs naturally turn to TypeScript.

TypeScript, as a programming language, is a great fit when it comes to creating fast, user-friendly AI applications. Its asynchronous programming features let multiple tasks run concurrently, which is key when dealing with potentially slow AI model calls. This means that the application can stay responsive even if AI is working in the background. 
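
A minimal TypeScript sketch of that pattern (the endpoint URL and response shape are hypothetical stand-ins for a real LLM API):

// Hypothetical LLM endpoint; most hosted model APIs follow this shape.
async function askModel(prompt: string): Promise<string> {
  const res = await fetch("https://api.example.com/v1/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  const data = (await res.json()) as { text: string };
  return data.text;
}

async function main() {
  // Both slow model calls run concurrently; the app stays responsive meanwhile.
  const [summary, title] = await Promise.all([
    askModel("Summarise this article in two sentences."),
    askModel("Suggest a headline for this article."),
  ]);
  console.log(title, summary);
}

main();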

As a result, there’s been a lot of discussion around whether TypeScript will replace JavaScript for AI development.

https://twitter.com/gethackteam/status/1818021430035873818

Why TypeScript Matters

One of the key reasons why developers are turning to TypeScript for AI is its ability to catch errors at compile-time rather than run-time. 

With complex AI algorithms and large datasets, identifying and fixing errors early in the development process is crucial. TypeScript’s static typing helps ensure code quality and reduces the likelihood of bugs slipping through to production.

A recent JetBrains survey reveals that TypeScript is rapidly gaining adoption, with 28% of developers now using it regularly. This growing popularity can be attributed to the language’s ability to improve the overall developer experience and increase confidence in the code.

Harrison Chase, the co-founder and CEO of LangChain, highlighted the growing trend of AI development happening in TypeScript. “We are seeing a lot of AI development happen in TypeScript. Evaluation is a CRITICAL part of AI development! Being able to evaluate gives you the confidence to move fast,” he added. 

While sharing a post-mortem report on past bugs and issues at JSConf Hawaii 2019, Brie Bunge, a senior staff software engineer at Airbnb, confirmed that 38% of bugs at Airbnb could have been prevented using TypeScript.

(Image: 38% of bugs at Airbnb could have been prevented using TypeScript)

TypeScript’s compatibility with popular AI libraries like TensorFlow.js and Brain.js enables developers to leverage existing JavaScript tools and frameworks while benefiting from TypeScript’s type safety and enhanced developer experience.

This compatibility allows for a seamless integration with the vast JavaScript ecosystem.

Moreover, TypeScript-first AI development frameworks like TypeAI and Axilla.io are good examples of the community’s commitment to making TypeScript a first-class citizen in the AI ecosystem.

These tools provide developers with the necessary abstractions and utilities to build AI applications more efficiently and with fewer errors.

Apart from AI development, TypeScript can also help developers understand their own code better. “TypeScript made me understand what I’m actually doing with every function, variable, before writing it… Also, it’s easier to understand object structure and just know more about certain objects,” a Reddit user said.

Can it Replace Python? 

Short answer: no. For AI and ML development, Python cannot be replaced by any other programming language – at least for now.

“Python has a lot of use cases around data vis, ML/DS, CI/CD, general scripting, and other things, some of which TS/JS just isn’t really used for currently,” said a Reddit user, suggesting a large ecosystem for data science and machine learning tasks.

We can’t deny Python’s dominance in AI, but there are users who have used both and prefer TypeScript. 

A Reddit user in favour of TypeScript said, “Most folks here are pretty adamant that Python is the only way. I spent years coding in Python (mostly Django API servers and some Pytorch) and have now spent a couple of years coding in TS, including building RAG backends. I dramatically prefer TS for a few reasons: Package management with npm is much better…The TS type system is much better than Python types and more widely supported by packages.”

Ultimately, while Python remains the dominant language for AI development, TypeScript is gaining traction and offers a compelling alternative for certain use cases. It will be interesting to see how the TypeScript community works to make the language even more relevant for AI development in the future.

The Persistent Flaw in Image Models: Time | https://analyticsindiamag.com/ai-insights-analysis/the-persistent-flaw-in-image-models-time/ | Mon, 26 Aug 2024

At the root of this issue lie the datasets used to train these AI image models.


It’s a bad time for AI image generators – literally. Most notable AI models, including the likes of DALL·E 3, Midjourney, Stable Diffusion, and Ideogram 2.0, are all struggling to generate analog clocks, often defaulting to 10:10.

When AIM tested DALL·E 3 with a simple prompt asking it to generate an image of an analog clock showing the time as 6:30, both of the images it produced were set to 10:10.

(Image: DALL·E 3 can’t tell the right time)

We even tried the recently released FLUX.1 AI Pro model with the same prompt, and the results were similar to what DALL·E 3 produced.

(Image: FLUX.1 AI can’t tell the right time)

What Explains This Fixation? 

At the root of this issue lie the datasets used to train these AI image models. 

Most of the clock and watch images used to train these models come from product photos and advertisements. As a strategy, clocks in such photos are almost always set to 10:10, as this V-shaped arrangement keeps the hands from obscuring logos usually placed at the 12:00 position and creates an aesthetically pleasing “smiling” face.

Now, since AI models learn patterns from training data, this overrepresentation of the 10:10 configuration gets baked in. The AI doesn’t understand the actual meaning or mechanics of clock hands – it simply learns from the statistical pattern that clocks should look like this. 

Unfortunately, this issue goes beyond generating images of analog clocks. The deeper problem is that AI doesn’t understand the concept of time. 

Text-to-Image Models Don’t Get Time

A Reddit user pointed out that AI has no concept of time. “It is basically a mathematical function where for a certain input a corresponding output is calculated. There is no timing mechanism whatsoever in that path,” he said, adding that LLMs can only derive such conclusions from their training data, because that is how LLMs work.

A Medium user reported how LLMs like ChatGPT are essentially “blacked out” between prompts, unable to maintain a continuous memory of events or experiences. This limitation prevents AI from forming a coherent understanding of the passage of time, as it lacks the ability to perceive and process time-related cues in the same way as humans do.

(Image: ChatGPT can’t track time. Source: Medium)

Granted, this problem could be addressed with a custom prompt that queries an internal clock, but differentiating between real time and hypothetical time would then become another challenge – which is likely why ChatGPT has not integrated such a feature.

AI’s inability to understand the concept of time also stems from its absence of real-world experiences. The development of a comprehensive understanding of time requires a gradual process of learning through experience and problem-solving. 

The Reasoning Part

AI systems, including the most advanced ones, are not considered conscious, and that partly explains why AI can’t reason. They process information but lack the self-awareness, feelings, and intentions that are central to human consciousness. This lack of consciousness limits their ability to reason flexibly and understand context deeply.

Explaining AI’s inability to reason, Subbarao Kambhampati, professor at Arizona State University, said, “The tricky part about reasoning is, if you ask me a question that requires reasoning and I gave an answer to you, on the face of it, you can never tell whether I memorised it.” 

A Medium post by James Gondola suggests that AI systems lack their own moral reasoning, which may lead to biases or overlook the subtle ethical factors that human judgement naturally considers.

There are techniques like Chain-of-Thought (CoT) prompting, where you prompt the model to break down its reasoning into a series of intermediate steps. This way, you can see more clearly how the model reached a specific output.
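
A minimal sketch with the OpenAI Python client (the model name is just an example; the CoT part is simply the trailing instruction in the prompt):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # example model name
    messages=[{
        "role": "user",
        "content": (
            "A clock shows 6:30. How many degrees has the minute hand "
            "travelled since 6:00? Let's think step by step."  # CoT trigger
        ),
    }],
)
print(response.choices[0].message.content)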

MIT researchers have proposed a neuro-symbolic AI technique that enables LLMs to better solve natural language, maths, and data analysis problems. It has two key components: neural networks that process and generate natural language and symbolic systems that perform logical reasoning and manipulation of symbols. 

By combining these two, neuro-symbolic AI can leverage the strengths of deep learning for handling unstructured data while also incorporating the reasoning capabilities of symbolic systems.

What is Stopping Devs from Building an LLM? | https://analyticsindiamag.com/developers-corner/what-is-stopping-devs-from-building-an-llm/ | Sat, 24 Aug 2024

Perhaps the most critical challenge that LLM developers face is the lack of robust methods for verifying the outputs of these models.


Unlike typical software development, LLM development is a distinctly different and more complex task, with its own unique set of challenges. One of the most formidable challenges faced by LLM developers is the “curse of multilinguality”. 

Sara Hooker, VP of research at Cohere AI, said, “When you try and make AI actually work for the world, you’re talking about this vast array of different languages. There are 7,000 languages in the world, and 80% of those have no text data.” 

This lack of diverse language data leads to models that overfit high-resource languages like English and Chinese while under-serving the “longtail” of low-resource languages.

It doesn’t stop there. Things get even worse when it comes to reasoning.

The Elusive Nature of Reasoning

As Subbarao Kambhampati, professor at Arizona State University, illustrates with the classic “manhole cover” interview question, “The tricky part about reasoning is if you ask me a question that requires reasoning and I gave an answer to you, on the face of it, you can never tell whether I memorised the answer and gave it to you or I actually reasoned from first principles.”

Assessing whether an LLM can truly reason rather than just match patterns is difficult. There is often a gap between an LLM’s ability to generate code or text that looks plausible versus its deeper understanding of the underlying logic and ability to reason about it.

Natural language relies heavily on context, shared understanding, and inference to convey meaning. This makes it difficult for LLMs to extract precise semantics and formal logic needed for rigorous reasoning just from language examples.

Furthermore, LLMs have no concept of reality outside of language and cannot test the truth of statements. They are unconcerned about whether concepts contradict each other and only focus on generating sentences that follow language rules.

David Ferrucci, the founder of Elemental Cognition, argues that natural language is insufficient for reliable logical reasoning and computations in complex domains. He states that “for complex reasoning problems where you cannot afford to be wrong, natural language is not the right medium”. 

“Without any underlying formalism, natural language’s ambiguity and subjectivity are great for casually navigating around into another human’s brain, but not the best for ensuring shared meaning and precise, reliable outcomes,” he added.

Ferrucci suggests that formal languages and reasoning systems are needed to enable complex problem-solving.

The Verification Gap

Perhaps the most critical challenge that LLM developers face is the lack of robust methods for verifying the outputs of these models. As Kambhampati notes, “It’s very hard to show what is and what is not on the web,” making it difficult to determine whether an LLM’s output is grounded in factual knowledge or mere hallucination.

A research paper titled ‘TrustLLM: Trustworthiness in Large Language Models’ developed a trustworthiness evaluation framework examining 16 mainstream LLMs across eight dimensions: fairness, machine ethics, privacy, robustness, safety, truthfulness, accountability, and transparency.

The researchers found that none of the tested models was truly trustworthy according to their benchmarks, highlighting the need for improved verification methods.

Aidan Gomez, the CEO of Cohere, mentioned that to improve reasoning, language models need to be shown how to break down tasks at a low level, think through problems step-by-step, and have an “inner monologue”. 

“However, data demonstrating this type of reasoning process is extremely scarce on the internet,” he added. 

One of the most significant challenges in verifying the outputs of LLMs is their inherent “black box” nature. LLMs are complex, opaque systems that make it difficult for developers and researchers to understand how they arrive at their outputs.

LLMs suffer from a lack of interpretability, which means it is challenging to understand the reasoning behind their responses. This opacity makes it difficult to identify the root causes of incorrect or inconsistent outputs, hindering efforts to improve the models’ reliability.

Another related issue is the limited explainability of LLMs. Even when an LLM provides an answer, it is often unclear how it arrived at that particular response. This lack of explainability makes it challenging for developers to troubleshoot issues and refine the models.

Addressing the challenges faced by LLM developers will require a multifaceted approach: developing more advanced verification methods to assess the factual accuracy and logical consistency of LLM outputs, and improving the interpretability and explainability of LLMs to better understand their inner workings.

By focusing on these key areas, researchers and developers can work towards creating LLMs that are more reliable, trustworthy, and capable of complex reasoning across diverse languages and domains.

It’s Disturbing How OpenAI’s DALL·E, FLUX.1, and Other Image Generators Can’t Grasp Time, Always Stuck at 10:10 | https://analyticsindiamag.com/ai-news-updates/its-disturbing-how-openais-dall%c2%b7e-flux-1-and-other-image-generators-cant-grasp-time-always-stuck-at-1010/ | Thu, 22 Aug 2024

Most AI image generators today excel at deepfakes yet fail to depict the correct time, often stuck at 10:10.


No matter how hard you try, text-to-image models, including your favourite, OpenAI’s DALL·E 3, won’t generate a correct picture of an analog watch showing a specified time. When AIM tried generating an image of an analog clock showing 6:30 on DALL·E, it gave us the following output:

(Output from DALL·E)

AIM tried doing the same with FLUX.1 AI Pro, but the results were pretty much the same:

(Output from FLUX.1 Pro)

AIM also tried the same prompt with Ideogram, which recently launched Ideogram 2.0, but it also struggled to show the correct time and was stuck at 10:10.

(Output from Ideogram)

Sure, if you prompt further, the time will change to something else, but in our testing it never showed the exact time given in the prompt.

Super scary, right? 

This phenomenon is likely due to the training data used by these models. Many stock photos and product images of analog clocks are set to 10:10, as this configuration is considered aesthetically pleasing. As a result, the AI models learn to associate the concept of an analog clock with the 10:10 arrangement.

Furthermore, the issue highlights a fundamental limitation of current AI systems: the inability to reason or think critically. Rather than understanding the components and hierarchy of a clock (e.g., hour, minute, and second hands), the models simply reproduce patterns observed in their training data.

The success of AI applications heavily relies on the similarity between the training data and the expected test data. Without the ability to reason or generalize to novel scenarios, AI models may struggle to perform reliably in real-world situations where the input data differs from the training examples.

Even Your Graph Neural Networks Need Attention | https://analyticsindiamag.com/developers-corner/even-your-graph-neural-networks-need-attention/ | Thu, 22 Aug 2024

MAG does not use any positional encodings, which contrasts with current trends for GNNs and Transformers.


While graph neural networks (GNNs) have long been the preferred choice for graph-based learning tasks, they often rely on intricate, custom-designed message-passing mechanisms. 

However, a novel approach has emerged that offers comparable or superior performance with a more streamlined methodology.

Enter masked attention for graphs (MAG), an innovative technique that reimagines graph representation. MAG conceptualises graphs as collections of nodes or edges, maintaining their interconnectedness through clever masking of the attention weight matrix. 

This simple solution has proven remarkably effective, surpassing not only strong GNN baselines but also more sophisticated attention-based approaches across a diverse array of more than 55 node- and graph-level tasks.
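
The core trick is easy to sketch. Below is a minimal single-head version in Python/NumPy (self-loops are added so every node attends to at least itself; actual MAG-style models are more elaborate):

import numpy as np

def masked_attention(X, A):
    """One attention layer over node features X, masked by adjacency A.

    X: (n, d) node features; A: (n, n) adjacency matrix (1 = edge, 0 = no edge).
    Scores between unconnected nodes are set to -inf, so after the softmax
    each node only attends to its graph neighbours.
    """
    n, d = X.shape
    A = A + np.eye(n)                          # self-loops keep every row non-empty
    scores = X @ X.T / np.sqrt(d)              # raw pairwise attention scores
    scores = np.where(A > 0, scores, -np.inf)  # mask out non-edges
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ X                         # neighbourhood-aggregated features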

Overcoming Traditional Problems

Junaid Ahmed, a software engineer, mentioned in a recent LinkedIn post that traditional GNNs have certain limitations, including complexity in design, scalability issues, and performance variability, and that these can be addressed through MAG.

Notably, MAG does not use any positional encodings, in contrast to current trends for GNNs and Transformers. It also does not require sophisticated learning-rate schedulers or optimisers.

A professor at Beijing University of Posts and Telecommunications, who goes by fly51fly on X, mentioned that MAG scales sub-linearly in memory and time with graph size. “It also enables effective transfer learning through pre-training and fine-tuning,” he said.

Another research paper titled ‘Masked Graph Transformer for Large-Scale Recommendation’ proposes using a Masked Graph Transformer (MGFormer) architecture for efficiently capturing all-pair interactions among nodes in large-scale recommendation systems.

The research paper further mentioned that the MGFormer outperforms GNN-based models, especially for lower-degree nodes, indicating its proficiency in capturing long-range dependencies for sparse recommendations. 

Even with a single attention layer, the MGFormer achieves superior performance compared to baselines.

Expanding the Horizons of Graphs via Masking 

In the field of cybersecurity, the cybersecurity entity alignment via masked graph attention networks (CEAM) model leverages an asymmetric masked aggregation mechanism to address the unique challenges of aligning security entities across different data sources.  

By selectively propagating specific attributes between entities, CEAM significantly outperforms state-of-the-art entity alignment methods on cybersecurity-domain datasets.

Masking has also found its way into video event recognition through the masked feature modelling (MFM) approach. 

MFM utilises a pre-trained visual tokeniser to reconstruct masked features of objects within a video, enabling the unsupervised pre-training of a graph attention network (GAT) block. 

When incorporated into a state-of-the-art bottom-up supervised video-event recognition architecture, the pre-trained GAT block improves the model’s starting point and overall accuracy.

In the domain of temporal knowledge graph reasoning, the Attention Masking-based Contrastive Event Network (AMCEN) employs historical and non-historical attention mask vectors to control the attention bias towards historical and non-historical entities. 

By separating the exploration of these entity types, AMCEN alleviates the imbalance between new and recurring events in datasets, leading to more accurate predictions of future events.

Overall, the emergence of MAG and related masked-attention techniques represents an exciting new direction in graph representation learning, offering both simplified architectures and state-of-the-art performance across applications such as cybersecurity, temporal knowledge graph reasoning, and video event recognition.

Why is Elon Musk’s Grok2 using FLUX.1 AI? | https://analyticsindiamag.com/ai-insights-analysis/why-is-elon-musks-grok2-using-flux-1-ai/ | Wed, 21 Aug 2024

FLUX.1 AI is one of very few text-to-image models that render human hands correctly, and it can be locally hosted too.


FLUX.1 AI, developed by Black Forest Labs, a German artificial intelligence startup, is quickly proving itself superior to the competition in both image quality and creative flexibility. The startup has also collaborated with xAI and is experimenting with its FLUX.1 model to expand Grok’s capabilities on X.

Unlike other popular AI image generators like DALL-E and Midjourney, FLUX.1 appears to have minimal content filters or restrictions. This aligns with Elon Musk’s stated goal of pushing back against what he sees as over-censorship of AI platforms.

This way, Grok 2 will now allow users to create uncensored deepfakes, like one where the 2024 US presidential candidates Kamala Harris and Donald Trump pose as a couple.

Apart from fewer restrictions and filters, FLUX.1 AI competes with popular models like Midjourney, DALL·E and Stable Diffusion, and beats all of them in terms of Elo scores.

(Image: FLUX.1 AI scores higher Elo ratings than popular models. Source: Medium)

A key differentiator of FLUX.1 is its ability to accurately render human hands and legs, an area where previous AI image models struggled due to inadequacies in training datasets.

FLUX.1 also supports a diverse range of aspect ratios and resolutions up to 2.0 megapixels. 

The Fusion of Multiple Architectures

What sets FLUX.1 AI apart from other models is its unique mixture of multiple architectures, allowing it to achieve superior results. At the core, FLUX.1 AI combines the strengths of both transformers and diffusion models. This powerful blend allows FLUX.1 AI to generate images with unprecedented speed and quality.

Transformer networks excel at understanding and processing sequential data, such as text; they help FLUX.1 AI interpret text prompts and translate them accurately into visual representations. Diffusion models, on the other hand, are skilled at generating high-quality images by iteratively refining noise into coherent structures. FLUX.1 AI leverages diffusion techniques to create images with intricate details and realistic textures.

Another great feature of FLUX.1 AI is its prompt adherence. Whether you use simple or complex prompts, the model delivers high-quality images that closely match the input description. A Medium user named LM Po used the prompt “a cat looking into a camera, point of view fisheye lens” and got results comparable to Midjourney V6.

(FLUX.1’s output after “a cat looking into a camera, point of view fisheye lens” prompt)

FLUX.1 AI Challenges Other Open Source Models

(Image generated by FLUX.1 AI)

When you compare the best open-source text-to-image models, you are left with limited choices, chiefly FLUX.1 AI and Stable Diffusion. Compared to Stable Diffusion, however, FLUX.1 AI is quite easy to prompt.

In head-to-head comparisons, FLUX.1 AI consistently outperforms Stable Diffusion in generating photorealistic images with lifelike details and textures.

Because of its open-source nature, multiple forks of FLUX.1 AI are available. One good example is a Marketing Assistant app based on FLUX.1 [Schnell], allowing users to create social media content, marketing advertisements, and more for free.

Pierrick Chevallier, an AI designer on X, said that his experience with FLUX.1 AI was amazing. “Plus, its text handling is way better than SD3 and Midjourney,” he added, praising the model further.

Users across multiple social media platforms are now blending FLUX.1 with other tools like Midjourney, Udio and Luma AI to create videos with transitions that would previously have required hours of work.

Surprisingly, Black Forest Labs, the company behind the new open-source AI tool, consists of former Stability AI employees – the same minds behind tools like Stable Diffusion, Latent Diffusion, and Stable Diffusion XL.

In short, move over Stable Diffusion, Midjourney, Imagen and DALL-E, the new AI image generation champion in town is here to stay. 

Why Isn’t There a Delete or Undo Button in LLMs? | https://analyticsindiamag.com/ai-insights-analysis/why-isnt-there-a-delete-or-undo-button-in-llms/ | Thu, 15 Aug 2024

If you think protecting private data was hard with databases, LLMs make it even harder.


“Where’s the delete and undo button in LLMs?” asked Anshu Sharma, co-founder and CEO of Skyflow, introducing the concept of LLM vaults in a recent interaction with AIM.

“LLM vaults are built on top of the proprietary detect engine that detects sensitive data from the training datasets used to build LLMs, ensuring that this data is not inadvertently included in the models themselves,” he added, saying the company has built a proprietary algorithm to detect sensitive information in unstructured data that is being stored in the vault. 

The Need for LLM Vault 

Sharma has a strong reason to believe so. “While storing data in the cloud with encryption will safeguard your data from obvious risks, in reality, we need layers. The same data can not be given to everyone. Instead, you can have an LLM vault that can identify sensitive data while inference and only share non-sensitive versions of the information with LLM,” said Sharma, explaining why the LLM vault matters.

(Image: LLM vault workflow. Source: Stack Overflow)

The vice president of Amazon Web Services, Jeff Barr, also mentioned that “The vault protects PII with support for use cases that span analytics, marketing, support, AI/ML, and so forth. For example, you can use it to redact sensitive data before passing it to an LLM”. 

Gokul Rajaram, a tech investor, explained the importance of LLM vaults: “If you think protecting private data was hard with databases, LLMs make it even harder. No rows, no columns, no delete. What is needed is a data privacy vault to protect PII, one that polymorphically encrypts and tokenises sensitive data before passing it to an LLM.”
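
As a toy illustration of that tokenisation step (a minimal sketch; production vaults use far more robust detection than two regexes), sensitive values are swapped for opaque tokens before the text reaches the model, and the mapping never leaves the vault:

import re

class ToyVault:
    """Replaces emails and phone numbers with tokens; keeps the mapping private."""

    def __init__(self):
        self._store = {}   # token -> original value (never leaves the vault)
        self._counter = 0

    def _tokenise(self, match):
        self._counter += 1
        token = f"<PII_{self._counter}>"
        self._store[token] = match.group(0)
        return token

    def redact(self, text):
        text = re.sub(r"[\w.+-]+@[\w.-]+\.\w+", self._tokenise, text)   # emails
        text = re.sub(r"\+?\d[\d\s-]{8,}\d", self._tokenise, text)      # phone numbers
        return text

    def restore(self, text):
        for token, value in self._store.items():
            text = text.replace(token, value)
        return text

vault = ToyVault()
safe = vault.redact("Contact Jane at jane@example.com or +1 555-010-9999.")
print(safe)  # the LLM only ever sees the tokenised version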

A few weeks ago, when Slack started training on user data, Sama Carlos Samame, the co-founder of BoxyHQ, raised a similar concern for organisations using AI tools, arguing that they should have LLM vaults to safeguard their sensitive data.

Going Beyond LLM Vault 

The likes of OpenAI, Anthropic and Cohere are also coming up with innovative methods and features to handle user and enterprise data. For instance, if you use the OpenAI API, your data won’t be used to train its models, and you can opt out of data sharing in ChatGPT. Privacy options like these somewhat reduce the need for LLM vaults.

Anthropic, on the other hand, has incorporated strict policies on how it uses data for training: it does not train on user data unless a user volunteers it or a specific scenario arises in which it collects such data.

Meanwhile, Cohere has collaborated with AI security company Lakera to protect against LLM data leakage by defining new LLM security standards. Together, they have created the LLM Security Playbook and the Prompt Injection Attacks Cheatsheet to address prevalent LLM cybersecurity threats.

There are other techniques like Fully Homomorphic Encryption (FHE) which allows computations to be performed directly on encrypted data without the need to decrypt it first. This means the data remains encrypted throughout the entire computation process, and the result is also encrypted.
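
Full FHE libraries are heavyweight, but the flavour of computing on encrypted data can be shown with the simpler, partially homomorphic Paillier scheme via the python-paillier (phe) library; a sketch of the idea, not an FHE implementation:

from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair()

enc = public_key.encrypt(42)   # ciphertext; the raw value is never exposed
enc = enc + 8                  # addition performed directly on encrypted data
enc = enc * 2                  # multiplication by a plaintext constant

print(private_key.decrypt(enc))  # prints 100; only the key holder can decrypt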

Generative AI is Complicating the Process of Note-Taking | https://analyticsindiamag.com/ai-insights-analysis/generative-ai-is-complicating-the-process-of-note-taking/ | Thu, 08 Aug 2024

AI note-taking apps often record and transcribe entire conversations, capturing a lot of potentially sensitive information. 


Recently, Zoom introduced its GenAI-powered Zoom Docs, which aims to boost efficiency by converting information from Zoom meetings into actionable documents and knowledge bases. Smita Hashim, the chief product officer at Zoom, emphasised its time-saving potential, stating, “Zoom Docs is purpose-built to empower people to ‘work happy’ and give them more time back in their day.”

However, many users still want to stick to the traditional method of note-taking. A Reddit user mentioned: “Writing is thinking. Write your notes so you know what you think about them. You’re only shortcutting yourself when you use AI.” 

Another user explained that he uses LLMs to analyse data but still makes notes manually. “My machine is too low on RAM to have a local LLM,” he added, further suggesting that running LLMs locally would be a better choice but he is limited by system resources. 

But users have a different opinion of tools like Obsidian, which lets you run an LLM locally through a plugin called Your Smart Second Brain. The plugin can use your notes as a database to give you better insights, and it can also be blocked from reading your notes.

Andrej Karpathy praised Obsidian for being simple and having no vendor lock-in features.

There is other note-taking software, similar to Obsidian, like Reor, which uses AI to organise notes; all the AI work is done by running models locally, using Ollama, Transformers.js and LanceDB to power them.

Where is the Privacy?

One of the key concerns among users is how much personal data these AI tools collect and what that data is used for. AI note-taking apps often record and transcribe entire conversations, capturing a lot of potentially sensitive information.

This data may be used to train and improve the AI models, without users having full transparency or control over the process. There are questions about how long the data is retained, who has access to it, and how well it is secured against breaches or unauthorised access.

Note-taking apps like Evernote don’t even encrypt your data. A software engineer on Reddit has confirmed that Evernote employees can read your data. So, using such apps with existing privacy concerns makes matters worse. 

The idea is not to remove AI note-taking apps from your workflow altogether but to use something that respects your privacy. Obsidian is certainly one good example. Joplin follows the same approach: it is an open-source note-taking app where you can use extensions like Jarvis to add AI capabilities that work both online and offline.

You can also consider Notty, an open-source, local-first note-taking app. The good thing about Notty is that you can use AI to help you write better notes while all the data stays local. You can opt for cloud sync, but that is optional.

Alternatively, you can try running LLMs locally on your computer and then use any open-source note-taking app to get the most secure platform to take notes.

It’s Not That Bad

Ilya Shabanov, the founder of The Effortless Academics, mentioned in a recent podcast that he has integrated ChatGPT into his note-taking; it assists while making notes and also helps check whether the notes are relevant to the topic.

A Reddit user mentioned that Otter.ai has helped him take notes efficiently while attending meetings, as it not only generates a transcription but also uses AI to extract the key points of the meeting itself.

The AI note-taking market is rapidly expanding, with a projected value of $19.46 billion by 2028. This growth is driven by the increasing demand for tools that can improve productivity and knowledge management. As a result, numerous AI note-taking apps have emerged, each offering unique features and benefits.

Why Prolog Might Be the Answer to Better AI Reasoning | https://analyticsindiamag.com/ai-insights-analysis/why-prolog-might-be-the-answer-to-better-ai-reasoning/ | Wed, 07 Aug 2024

While the JSON-based API can help developers improve the output of their models, with Prolog, models can have better reasoning capabilities from the start.


OpenAI recently released Structured Outputs, a JSON schema-based API feature that helps developers constrain the output of their models and achieve 100% reliability in matching the supplied schema. But even with OpenAI confirming 100% schema reliability, there are still chances of model errors.

(Image: OpenAI notes that Structured Outputs doesn’t prevent all kinds of model mistakes)

This begs the question: why pursue this approach when similar results could be achieved at the programming-language level with Prolog?

Prolog, a powerful yet often overlooked programming language from AI’s symbolic era, excels at “old-school” symbolic AI tasks such as knowledge-based systems and rule-based natural language processing. Its declarative nature allows developers to specify rules and facts about a problem domain, letting the interpreter automatically infer solutions. This makes it one of the best languages for classic AI problems such as search and constraint satisfaction.

Prolog also shines in handling uncertain or incomplete data. Programmers can specify rules that might be true or false, and Prolog will reason out the problem to find the most likely and accurate solution given available information. This is a key advantage in real-world AI scenarios where information is often incomplete.

Similarly, a study published as part of the International Computer Science Series suggested that Prolog is well-suited for AI development due to its declarative nature, for the very reason that programmers can specify rules and facts, and Prolog’s built-in inference engine can derive conclusions.

The report further mentioned that Prolog’s backtracking mechanism allows for efficient searching of solution spaces. The report provides examples of using Prolog for natural language processing tasks like parsing sentences. Prolog’s ability to handle recursive rules and symbolic expressions makes it useful for implementing expert systems and knowledge representation.
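
A classic sketch of that declarative style (standard Prolog, not tied to any of the systems above): facts and two rules define ancestry, and the built-in inference engine does the search via backtracking.

% Facts: parent(Parent, Child).
parent(tom, bob).
parent(bob, ann).

% X is an ancestor of Y directly, or through some intermediate Z (recursive rule).
ancestor(X, Y) :- parent(X, Y).
ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).

% At the interactive prompt:
% ?- ancestor(tom, ann).
% true.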

Prolog in Production Environments

IBM’s Watson system, famous for winning Jeopardy, uses Prolog for pattern matching over natural language parse trees. According to the developers, Prolog was chosen for its “simplicity and expressiveness” in specifying pattern-matching rules.

“We required a language in which we could conveniently express pattern matching rules over the parse trees and other annotations (such as named entity recognition results), and a technology that could execute these rules very efficiently. We found that Prolog was the ideal choice for the language due to its simplicity and expressiveness,” said former IBM senior technical staff member Adam Lally.

Meanwhile, Kyndi, an AI company, which was acquired by Qlik earlier this year, used Prolog for its natural language processing software because of its logic-based capabilities. Kyndi founder Ryan Welsh said, “Prolog is more logic-driven and more powerful in its abstract reasoning than modern programming languages such as C++ or Java.”

Similarly, TerminusDB, an open-source graph database and document store, has been implemented in Prolog. The choice of Prolog as the implementation language is significant due, again, to its declarative nature and ability to express complex rules, making it well-suited for a deductive database system like TerminusDB.

Prolog’s logic-based approach allows TerminusDB to efficiently handle complex queries and reasoning over the stored data. The declarative style of Prolog enables developers to focus on specifying the desired outcomes rather than the step-by-step process of achieving them, which aligns well with the declarative nature of database queries.

Furthermore, GeneXus incorporates Prolog as part of its rule-based system for developing smart applications with AI capabilities. Prolog’s declarative and logic-based approach aligns well with specifying business rules and complex application logic.

TextRazor, a London-based startup, also performs text analysis using an engine coded in Prolog. “TextRazor uses Prolog as its rules engine. Our system was designed to make common patterns easy to build without technical expertise, while keeping the full power and expressiveness of the Prolog language for more complex tasks,” TextRazor said in a blog.

Considering how effective Prolog’s declarative nature makes it, the question is why AI models were not built on rule-based programming like Prolog in the first place. It seems many of the problems within these models could have been eliminated with a single change at the foundation.

7 Ways to Train LLMs Without Human Intervention https://analyticsindiamag.com/ai-insights-analysis/7-ways-to-train-llms-without-human-intervention/ https://analyticsindiamag.com/ai-insights-analysis/7-ways-to-train-llms-without-human-intervention/#respond Wed, 31 Jul 2024 12:35:10 +0000 https://analyticsindiamag.com/?p=10130949

The price for manual labelling tasks can range from $0.05 to $0.30 per label.

The post 7 Ways to Train LLMs Without Human Intervention appeared first on AIM.

]]>

Training LLMs traditionally requires extensive human intervention, which is both time-consuming and costly. The cost of human data labelling alone, for instance, can be substantial. 

According to Google Cloud, the price for labelling tasks can range from $0.05 to $0.30 per label, and large-scale projects often require millions of labels, leading to costs that can easily reach hundreds of thousands to millions of dollars. 

So, here are methods that can be used to reduce human intervention and the overall cost of training LLMs.

Plan like a Graph (PLaG)

PLaG involves encoding graph structures into a format that LLMs can process. By representing nodes and edges as tokens, LLMs can learn to understand and manipulate graph-based data, enhancing their reasoning capabilities and problem-solving skills.

Graph-based learning enables LLMs to handle complex, structured data more effectively, making them suitable for applications like knowledge graphs, molecule discovery, and network analysis.
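
As a rough illustration of the encoding step, the sketch below serialises a small dependency graph into plain text that can be placed in an LLM prompt; the tasks and prompt wording are assumptions for illustration, not the paper’s exact format.

```python
def graph_to_prompt(nodes, edges, question):
    """Serialise a dependency graph into plain text for an LLM prompt."""
    lines = [f"Tasks: {', '.join(nodes)}"]
    lines += [f"'{a}' must finish before '{b}' can start." for a, b in edges]
    lines.append(question)
    return "\n".join(lines)

nodes = ["boil water", "add noodles", "serve"]
edges = [("boil water", "add noodles"), ("add noodles", "serve")]
print(graph_to_prompt(nodes, edges, "In what order should the tasks run?"))
```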

Self-Rewarding Language Models from Meta

Meta recently published a paper explaining how Self-Rewarding Language Models (SRLMs) can be used to train LLMs without human intervention. SRLMs use LLM-as-a-Judge prompting to generate their own rewards during training. This iterative process allows the model to improve its instruction-following capabilities and reward-modelling abilities without human feedback.

This approach reduces dependency on human-generated data and feedback, enabling continuous self-improvement and potentially surpassing human performance limitations.
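
One iteration of that loop can be sketched as follows; `llm` is a placeholder for any chat-model call, and the judging prompt is an invented simplification of the paper’s LLM-as-a-Judge setup.

```python
def llm(prompt: str) -> str:
    # Placeholder: swap in a real model call (API or local inference).
    return "3"

def self_reward_step(instruction: str, n_candidates: int = 4) -> dict:
    # 1. The model proposes several candidate responses.
    candidates = [llm(instruction) for _ in range(n_candidates)]
    # 2. The same model, prompted as a judge, scores each candidate.
    scored = []
    for cand in candidates:
        judge = (
            f"Rate this response to '{instruction}' from 1-5. "
            f"Reply with a number only:\n{cand}"
        )
        try:
            score = float(llm(judge))
        except ValueError:
            score = 0.0
        scored.append((score, cand))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # 3. Best and worst form a preference pair for DPO-style fine-tuning.
    return {"prompt": instruction, "chosen": scored[0][1], "rejected": scored[-1][1]}
```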

Autonomous Learning for LLMs

Autonomous Learning allows LLMs to learn independently by interacting with text data, similar to how humans read and comprehend literature. The model identifies and reinforces its knowledge gaps through a self-sufficient learning loop.

Autonomous learning enhances the efficiency and effectiveness of LLM training by eliminating the need for annotated data and human supervision, paving the way for more advanced and self-reliant AI systems.

Sequential Instruction Tuning (SIT)

SIT involves fine-tuning LLMs on tasks that require solving sub-tasks sequentially. This method improves the model’s ability to follow complex, multi-step instructions and enhances its performance on downstream tasks.

SIT equips LLMs with the ability to handle intricate queries and tasks, making them more versatile and capable of performing complex operations autonomously.

Interactive Self-Reflection

Through Interactive Self-Reflection (ISR), the model generates solutions to given tasks and then reviews its own responses to identify and correct errors. This iterative self-review process allows the LLM to refine its understanding and enhance its performance autonomously.

It enables LLMs to learn from their mistakes without external feedback, fostering continuous improvement. This self-reflective capability is crucial for developing more accurate and reliable AI systems that can adapt and optimise their outputs over time.
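
A minimal sketch of such a generate-critique-revise loop, again with `llm` standing in for a real model call:

```python
def llm(prompt: str) -> str:
    return "stub answer"  # placeholder for a real model call

def reflect_and_revise(task: str, rounds: int = 2) -> str:
    answer = llm(task)
    for _ in range(rounds):
        # The model critiques its own answer...
        critique = llm(f"Task: {task}\nAnswer: {answer}\nList any errors in this answer.")
        # ...then rewrites it with the critique in context.
        answer = llm(
            f"Task: {task}\nAnswer: {answer}\nCritique: {critique}\n"
            "Rewrite the answer, fixing the listed errors."
        )
    return answer
```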

Self-Playing Adversarial Language Game (SPAG)

In SPAG, LLMs act as both attacker and defender in a two-player adversarial game. This self-play mechanism enhances the model’s reasoning abilities by forcing it to infer and express information in a competitive scenario.

It pushes LLMs to develop advanced reasoning skills and improve their performance on a broad range of benchmarks, making them more robust and capable.

Automated Design-Data Augmentation Framework

This framework generates high-quality natural language descriptions of technical scripts, such as Verilog/EDA, to augment training data. This automated process significantly reduces the time and effort required for data preparation.

Automated data augmentation enhances the robustness and accuracy of LLMs by providing diverse and high-quality training examples, leading to better performance in specialised tasks like code generation and repair.
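
The core of such a pipeline can be sketched in a few lines; the prompt and training-pair layout below are assumptions for illustration rather than the framework’s actual implementation.

```python
def llm(prompt: str) -> str:
    return "stub description"  # placeholder for a real model call

def augment(verilog_snippets):
    """Pair each script with a generated natural-language description."""
    pairs = []
    for code in verilog_snippets:
        description = llm(f"Describe, in plain English, what this Verilog module does:\n{code}")
        # Each (description, code) pair becomes a synthetic training example.
        pairs.append({"input": description, "output": code})
    return pairs
```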

These innovative methods represent a significant leap forward in the autonomous training of LLMs, reducing the reliance on human intervention and enabling continuous self-improvement. As these techniques evolve, they hold the promise of creating more advanced, efficient, and capable language models.

The post 7 Ways to Train LLMs Without Human Intervention appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-insights-analysis/7-ways-to-train-llms-without-human-intervention/feed/ 0
Nokia, Connecting GPUs https://analyticsindiamag.com/ai-insights-analysis/nokia-connecting-gpus/ https://analyticsindiamag.com/ai-insights-analysis/nokia-connecting-gpus/#respond Tue, 30 Jul 2024 09:44:36 +0000 https://analyticsindiamag.com/?p=10130722

Nokia is the only firm capable of delivering all key networking components outside of China, positioning it as a pivotal player in connectivity.

The post Nokia, Connecting GPUs appeared first on AIM.

]]>

Nokia is making a comeback! The phone maker that once ruled the mobile market, “connecting people” for over two decades, is now planning to do the same with its networking solutions in the age of AI. 

Recently, Nokia CEO Pekka Lundmark highlighted the company’s unique position in the global market, emphasising that Nokia is the only firm capable of delivering all key networking components outside of China. 

This includes core network software, transport networks, optical connections, and fixed broadband and mobile access networks. This capability positions Nokia as a pivotal player in the connectivity landscape, essential for leveraging the full potential of AI and cloud technologies.

In a recent interview, Lundmark discussed the paradigm shift in networking, emphasising the need for simplified, automated, and programmable networks to support the next generation of applications, including AI-driven services. 

He pointed out that without robust networking infrastructure, the benefits of AI and cloud computing would be unattainable.

Unique Partnerships & Collaborations

In February 2024, Nokia announced a partnership with NVIDIA to leverage the latter’s cutting-edge Grace CPU Superchip and GPUs to enhance Nokia’s anyRAN solution, which allows operators to choose between traditional, hybrid, or cloud-native RAN (Radio Access Network) environments. 

The integration of AI into Cloud RAN is expected to bring mobile networks unprecedented levels of efficiency, flexibility, and performance. 

Tommi Uitto, the president of mobile networks at Nokia, emphasised the transformative potential of this collaboration: “This is an important collaboration with NVIDIA that will explore how AI can play a transformative role in the future of our industry. 

“It is a further example of our anyRAN approach that is helping to make Cloud RAN a commercial reality”.

That same month, Nokia and Qualcomm announced a joint research initiative focused on AI interoperability technology designed to boost wireless capacity and performance. 

Utilising a technique called sequential learning, Nokia and Qualcomm have developed a prototype that allows AI models to be independently trained while still coordinating effectively. 

This approach enables multi-vendor interoperability without the need to share proprietary AI models, which are often a key differentiator for vendors. 

The prototype was demonstrated at Mobile World Congress 2024, showcasing how sequential learning can optimise radio performance and reduce energy consumption in wireless systems. 

Nokia has also collaborated with Dell to integrate its infrastructure solutions with Nokia’s private wireless networks, creating a robust ecosystem for enterprise customers. 

By leveraging Dell’s extensive infrastructure capabilities, Nokia aims to enhance the deployment and management of private wireless networks, which are crucial for industrial IoT applications and smart factories. 

“Through our collaboration, Nokia and Dell Technologies will harness each other’s expertise and expand distribution to quickly scale modern telecom networks and private 5G use cases,” said Dennis Hoffman, senior vice president and general manager of telecom systems business, Dell Technologies.

On the software side, Nokia’s partnership with Google marks a significant milestone in integrating AI into telecommunications. By incorporating Google’s AI solutions into its Network as Code platform, Nokia aims to simplify the creation and deployment of 5G applications. 

Connecting People (through AI)

Nokia’s exploration of generative AI represents another significant milestone in its AI journey. The company has been leveraging cloud computing and in-house solutions to develop custom LLM-based tools. 

This initiative, known as the Nokia LLM Gateway, has seen over 200 use-case candidates, with 52 advancing to the proof-of-concept stage.

Nokia’s commitment to AI is evident in its approach to network deployments. By integrating AI and ML, Nokia has significantly improved the accuracy, efficiency, and safety of its field operations. 

This has led to a 30% increase in First Time Right achievements and a 25% reduction in quality verification time. Nokia is planning to integrate AI across network operations by 2030 to enable real-time, autonomous responses to network needs and events. 

This strategy aims to deliver network-wide performance optimisation, zero-touch automation, and enhanced security, privacy, and energy efficiency.

The post Nokia, Connecting GPUs appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-insights-analysis/nokia-connecting-gpus/feed/ 0
Why Uncensored Models Matter  https://analyticsindiamag.com/ai-insights-analysis/why-uncensored-models-matter/ https://analyticsindiamag.com/ai-insights-analysis/why-uncensored-models-matter/#respond Mon, 29 Jul 2024 08:35:03 +0000 https://analyticsindiamag.com/?p=10130462

After ChatGPT restrictions, uncensored LLMs began outperforming aligned models in some tests as ChatGPT's response quality declined.

The post Why Uncensored Models Matter  appeared first on AIM.

]]>

One of the primary drawbacks of censored models is the so-called “alignment tax”. This refers to the performance degradation that occurs when models are over-tuned to align with specific ethical guidelines.

But it goes beyond performance. “Uncensored models do not have any bias, and using an unbiased model is important when you are building a product on top of LLM,” Nidum.AI co-founder Arjun Reddy told AIM.

Reddy further mentioned that, due to such biases, the company avoids Llama in favour of Dolphin Llama, underscoring why an unbiased model matters when building a product on top of an LLM. 

A Reddit user, who goes by the name hardmaru, observed a decline in the quality of ChatGPT’s responses after additional restrictions were introduced. Eventually, uncensored LLMs performed better than the aligned models in some tests.

Another user, who goes by BThunderW on Reddit, mentioned that jailbreaking ChatGPT-3.5 yielded more informative results than the restricted versions, suggesting how much you can squeeze out of uncensored models compared to aligned ones. 

The Unbiased Tale

Reddy mentioned that using unbiased LLMs is like working with a blank canvas, making it easy to train an LLM for a specific need. Sure, you can train an aligned model (like Meta’s Llama), but working around its built-in biases is quite difficult, and it will eventually affect your product. 

Every mainstream model is aligned to promote equality. There is nothing wrong with promoting equality, but with LLMs it directly affects the output. For example, a few months ago, Gemini’s attempt to be a ‘woke’ AI model faced a backlash.

A biased AI system can lead to discriminatory practices, such as denying loans based on racial or gender biases. This not only affects individuals but also undermines trust in AI technologies. 

A DataRobot report highlighted that 42% of organisations using AI are extremely concerned about the reputational damage caused by biased AI systems.

LLMs, such as OpenAI’s GPT-3.5 and Meta’s Llama 2, are trained on vast datasets that also reflect the biases present in society. These biases can manifest in harmful ways, reinforcing stereotypes and perpetuating discrimination. 

For instance, a study commissioned by UNESCO found that LLMs exhibited clear gender biases, associating female names with traditional roles like “family” and “children”. In contrast, male names were linked to “career” and “management”.

And the Tale has Now Become an Epic

AIM noticed that users are appreciating uncensored models like never before. David Ha, one of the co-founders of Sakana AI, mentioned on X that WizardLM-13B-Uncensored has become his favourite open-source model.

Lars Juhl Jensen, a professor at the Center for Protein Research at UCPH, praised on X how unfiltered uncensored LLMs are. “To hear the truth, ask a kid, a drunk, or an uncensored,” he added.

With entrepreneurs like Reddy already leveraging uncensored LLMs and gaining popularity in the community, it is safe to say we may soon see uncensored LLMs adopted on large-scale platforms.

The post Why Uncensored Models Matter  appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-insights-analysis/why-uncensored-models-matter/feed/ 0
NVIDIA Starts Supplying GH200 AI Chips to Tata Communications, Jio Platforms https://analyticsindiamag.com/ai-news-updates/nvidia-starts-supplying-gh200-ai-chips-to-tata-communications-jio-platforms/ https://analyticsindiamag.com/ai-news-updates/nvidia-starts-supplying-gh200-ai-chips-to-tata-communications-jio-platforms/#respond Thu, 25 Jul 2024 14:29:14 +0000 https://analyticsindiamag.com/?p=10130259

Yotta being an elite partner received the first shipment of the 4000 H100s in March 2024.

The post NVIDIA Starts Supplying GH200 AI Chips to Tata Communications, Jio Platforms appeared first on AIM.

]]>

The wait is finally over. NVIDIA has commenced the delivery of its latest GH200 AI chips to Indian partners Tata Communications and Jio Platforms. This is part of NVIDIA’s broader initiative to enhance India’s AI-cloud infrastructure, following its partnerships with Reliance and Tata Group companies, announced in September last year.

The GH200 AI chips, designed for high-performance computing and data analytics, are being integrated into the AI-cloud infrastructure of Tata Communications and Jio Platforms.

Tata Communications Managing Director and CEO A.S. Lakshminarayanan confirmed that the installation process is underway, with a full launch of the AI Cloud with NVIDIA expected by the third quarter of this fiscal year.

The Indian government has been proactive in its support for AI development, as evidenced by the IndiaAI mission launched in March 2024. This mission aims to make India a global leader in AI by investing in computing infrastructure, fostering innovation, and supporting startups. 

The government’s commitment is further underscored by significant investments, including a INR 10,300 crore allocation to expand AI infrastructure and make GPUs more accessible.

In March 2024, Yotta, an Elite partner, received the first shipment of 4,000 H100s, which was also the first shipment of NVIDIA GPUs to India. Yotta plans to scale up its GPU inventory to 32,768 units by the end of 2025. Last year, the company announced that it would import 24,000 GPUs, including NVIDIA H100s and L40S units, in a phased manner.

However, acquiring the highly valued NVIDIA GPUs is no mean feat. NVIDIA sells its GPUs through the NVIDIA Partner Program, which includes Registered, Preferred, and Elite categories. According to NVIDIA’s blog post, Elite partners represent the highest level of partnership and the tag is reserved for those demonstrating exceptional commitment.

Yotta CEO Sunil Gupta added that India could build five GPT-4 models simultaneously using its existing infrastructure. “I have ordered 16,000 [GPUs], so if there are five customers each wanting to make a GPT-4, I can handle their load simultaneously,” said Gupta.

The post NVIDIA Starts Supplying GH200 AI Chips to Tata Communications, Jio Platforms appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/nvidia-starts-supplying-gh200-ai-chips-to-tata-communications-jio-platforms/feed/ 0
Forget Mixture of Experts, Mixture of Agents is Here https://analyticsindiamag.com/ai-insights-analysis/forget-mixture-of-experts-mixture-of-agents-is-here/ https://analyticsindiamag.com/ai-insights-analysis/forget-mixture-of-experts-mixture-of-agents-is-here/#respond Thu, 25 Jul 2024 06:07:31 +0000 https://analyticsindiamag.com/?p=10130155

While mixture of experts is an innovative approach to overcome hardware restrictions, mixture of agents goes one step further in providing flexibility and depth.

The post Forget Mixture of Experts, Mixture of Agents is Here appeared first on AIM.

]]>

Just as we began getting comfortable with the mixture of experts (MoE) method, the mixture of agents (MoA) approach started to gain prominence. MoA takes the concept of specialisation a notch higher by leveraging the collective strengths of multiple LLMs. 

Unlike MoE, which operates within a single model, MoA employs a layered architecture where each layer comprises several LLM agents.

“While mixture of experts is an innovative approach to overcome hardware restrictions, mixture of agents goes one step further in providing flexibility and depth, which is not possible with MoE,” Arjun Reddy, the co-founder of Nidum.AI, told AIM

For countries like India, where computational resources and data availability can be limiting factors, MoA offers a practical and scalable solution. MoA can achieve state-of-the-art results without the need for extensive computational power or data by utilising open-source models and focusing on collaboration rather than individual model performance.

Recent research highlights the transformative potential of MoA. A study by Together AI demonstrates how MoA significantly enhances the capabilities of LLMs by constructing a layered architecture, where each layer comprises multiple agents. 

These agents collaboratively generate responses by utilising outputs from the previous layer, leading to state-of-the-art performance on benchmarks like AlpacaEval 2.0, MT-Bench, and FLASK. For instance, the MoA model achieved a score of 65.1% on AlpacaEval 2.0, outperforming GPT-4 Omni’s 57.5%.

The Rise of MoA

OpenAI is exploring the MoA framework through its multi-agent debate technique. This method involves multiple independent agents simultaneously attempting to solve the same problem proposed by the user. Each agent retains its solution in memory, and the system synthesises these solutions to arrive at a final response. 

In a post on X, Together AI explains how MoA works and how it can be implemented in just 50 lines of code, showcasing the approach’s simplicity and effectiveness.
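
In the same spirit, a toy two-layer version fits in a few lines; `query_model` and the model names below are placeholders for illustration, not Together AI’s implementation.

```python
def query_model(model: str, prompt: str) -> str:
    # Placeholder: swap in a real API client for each model.
    return f"[{model}] stub answer"

PROPOSERS = ["model-a", "model-b", "model-c"]  # hypothetical layer-1 agents
AGGREGATOR = "model-d"                         # hypothetical final agent

def mixture_of_agents(question: str) -> str:
    # Layer 1: several LLMs draft answers independently.
    drafts = [query_model(m, question) for m in PROPOSERS]
    # Layer 2: an aggregator synthesises the drafts into one response.
    prompt = (
        f"Question: {question}\n\nCandidate answers:\n"
        + "\n---\n".join(drafts)
        + "\n\nCombine these into a single, higher-quality answer."
    )
    return query_model(AGGREGATOR, prompt)

print(mixture_of_agents("Why is the sky blue?"))
```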

An analysis by Ajith’s AI Pulse elaborates on MoA’s layered architecture, where each layer includes multiple LLM agents. Each agent processes the outputs of agents from the previous layer, refining and enhancing the response iteratively. 

This collaborative process enables the model to leverage the strengths of different LLMs, resulting in improved performance. The rise also favours a general audience: anyone can create their own local mixture of agents using a LlamaIndex llama-pack, a glimpse of how flexible and effective MoA is.

A research paper titled ‘Mixture of Agents: A New Paradigm for Large Language Models’ provides a comprehensive theoretical foundation for the MoA framework, exploring how the collaboration of multiple agents leads to improved performance metrics, enhanced accuracy, and scalability. 

Mixture of Agents in Action

This innovative approach is already being harnessed in various cutting-edge applications, demonstrating its potential to revolutionise the field. For instance, the integration of MoA with Grok has shown remarkable improvements in AI performance, surpassing even GPT-4 in speed and efficiency.

Notably, Andrej Karpathy has also shared his insights on MoA in his recent posts, discussing how people will take Llama 3.1 405B and distil it into small agents for narrow tasks and applications. This points towards a growing community of AI enthusiasts and professionals actively exploring the potential of MoA. 

NVIDIA has demonstrated the use of AI agents to optimise supply chains through their cuOpt microservice. This system uses multiple LLM agents to handle complex optimisation tasks, transforming natural language queries into optimised plans. 

This approach exemplifies how MoA can be applied to real-world problems, enhancing efficiency and decision-making processes in large-scale operations.

The post Forget Mixture of Experts, Mixture of Agents is Here appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-insights-analysis/forget-mixture-of-experts-mixture-of-agents-is-here/feed/ 0
Now You Can Run Llama 3.1 405B on Your Computer Using Peer-to-Peer Network https://analyticsindiamag.com/ai-breakthroughs/now-you-can-run-llama-3-1-405b-on-your-computer-using-peer-to-peer-network/ https://analyticsindiamag.com/ai-breakthroughs/now-you-can-run-llama-3-1-405b-on-your-computer-using-peer-to-peer-network/#respond Wed, 24 Jul 2024 06:27:16 +0000 https://analyticsindiamag.com/?p=10130040

Nidum.AI plans to use 2000+ Apple computers to run Llama 3.1 on P2P network.

The post Now You Can Run Llama 3.1 405B on Your Computer Using Peer-to-Peer Network appeared first on AIM.

]]>

Not everyone can access highly spec’d machines capable of running LLMs locally, which often require substantial computational power and memory. 

“GPUs like H100s, which are essential to train and run LLMs efficiently on a large scale, are beyond the budgets of most startups. And running models like Llama 3.1 405B is unthinkable for regular people. 

“Renting GPUs and running them on a single cluster or using peer-to-peer connections is one of the easiest ways to do it,” Arjun Reddy, the co-founder of Nidum.AI, told AIM.

P2P technology is already used in blockchains, which is a testament to how secure such networks can be. P2P first came into the limelight in 1999, when Napster used it to decentralise music distribution, allowing users to download and host music files from their own computers.

Reddy further explained the approach they follow for the P2P technology. It starts with fine-tuning the existing model for specific needs; the model is then divided into hundreds of small parts and distributed across the P2P network. 

A layer of encryption is used to safeguard data. 

To showcase the flexibility of P2P technology, Reddy is about to host the largest decentralised AI event later this week where hundreds of Apple computers will be used to run Llama 3.1 through the P2P network. The idea is to demonstrate the importance of decentralised networks to run LLMs. 

The Promise of Peer-to-Peer Network

P2P networks, popularised by file-sharing systems like BitTorrent, distribute tasks across multiple nodes, each contributing a portion of the overall workload. 

Applying this concept to AI, a P2P network could theoretically distribute the training of an LLM across numerous consumer-grade GPUs, making it possible for individuals and smaller organisations to participate in AI development.

A research paper titled ‘A Peer-to-Peer Decentralised Large Language Models’ discusses a provably guaranteed federated learning (FL) algorithm designed for training adversarial deep neural networks, highlighting the potential of decentralised approaches for LLMs.

A study by Šajina Robert et al. explored multi-task peer-to-peer learning using an encoder-only Transformer model. This approach demonstrated that collaborative training in a P2P network could effectively handle multiple NLP tasks, highlighting the versatility of such systems.

Another significant contribution comes from Sree Bhargavi Balija and colleagues, who investigated building communication-efficient asynchronous P2P federated LLMs with blockchain technology. Their work emphasises the importance of minimising communication overhead and ensuring data integrity in decentralised networks.

But There are Challenges… 

Despite the promise, significant challenges hinder the practical implementation of P2P networks for LLMs. One major issue is the bandwidth and latency required for efficient training. 

Training LLMs involves transferring vast amounts of data between nodes, which can be prohibitively slow on consumer-grade networks. One Reddit user pointed out that even on a 10-gigabit network, the data transfer rates would be insufficient compared to the high-speed interconnects used in dedicated GPU clusters.

Moreover, the synchronisation required for distributed gradient descent, a common optimisation algorithm in training neural networks, adds another layer of complexity. 

Traditional training methods rely on tight synchronisation between nodes, which is difficult to achieve in a decentralised setting. 

A research paper on the review of synchronous stochastic gradient descent (Sync-SGD) highlights the impact of stragglers and high latency on the efficiency of distributed training. 

… And Solutions

Despite these challenges, ongoing efforts exist to make decentralised AI a reality. Projects like Petals and Hivemind are exploring ways to enable distributed inference and training of LLMs. 

Petals, for example, aims to facilitate the distributed inference of large models by allowing users to contribute their computational resources in exchange for access to the network’s collective AI capabilities.
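
Based on the project’s public examples, inference with Petals looks roughly like standard Hugging Face usage, except the model’s layers are served by peers over the internet; the model name below is one the public swarm has hosted, though availability changes over time.

```python
# pip install petals
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "petals-team/StableBeluga2"  # a model served on the public swarm
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("A peer-to-peer network is", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=20)  # layers run on remote peers
print(tokenizer.decode(outputs[0]))
```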

Additionally, the concept of federated learning offers a more feasible approach to decentralised AI. 

In federated learning, multiple nodes train a model on their local data and periodically share their updates with a central server, which aggregates the updates to improve the global model. 

This method preserves data privacy and reduces the need for extensive data transfer between nodes. It could also be a practical solution for decentralised AI, especially in privacy-sensitive applications like medical machine learning.
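
The aggregation step at the heart of this idea, federated averaging (FedAvg), is simple to sketch; plain Python dicts stand in for real model weights here, and a production system would additionally weight each client by its dataset size.

```python
import copy

def federated_average(client_weights):
    """Average model parameters reported by several clients (FedAvg)."""
    avg = copy.deepcopy(client_weights[0])
    for name in avg:
        for weights in client_weights[1:]:
            avg[name] += weights[name]
        avg[name] /= len(client_weights)
    return avg

clients = [{"w": 1.0, "b": 0.5}, {"w": 3.0, "b": 1.5}]
print(federated_average(clients))  # {'w': 2.0, 'b': 1.0}
```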

The post Now You Can Run Llama 3.1 405B on Your Computer Using Peer-to-Peer Network appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-breakthroughs/now-you-can-run-llama-3-1-405b-on-your-computer-using-peer-to-peer-network/feed/ 0
Linux Creator Advocates Waiting for 10-Years Before Embracing AI https://analyticsindiamag.com/ai-insights-analysis/linux-creator-advocates-waiting-for-10-years-before-embracing-ai/ https://analyticsindiamag.com/ai-insights-analysis/linux-creator-advocates-waiting-for-10-years-before-embracing-ai/#respond Mon, 22 Jul 2024 13:01:41 +0000 https://analyticsindiamag.com/?p=10129835

A report suggests that 81% of developers are already using AI-powered coding assistants like ChatGPT and GitHub Copilot. 

The post Linux Creator Advocates Waiting for 10-Years Before Embracing AI appeared first on AIM.

]]>

“I hate the AI hype and, at the same time, I think AI is very interesting,” said the creator of the Linux kernel, Linus Torvalds, in a recent conversation with Verizon’s Dirk Hohndel, when asked if AI is going to replace programmers and creators. 

Torvalds asserted that he doesn’t want to be a part of the AI hype and suggested we should wait for ten years before making broad announcements like claiming that jobs will be lost in the next five years. 

“Using smarter tools is just the next inevitable step so that’s going to happen but I don’t think it’s necessarily the gloom and doom that some people say it is. I definitely don’t think it’s the promised world that the people, who are having their hand out for cash, say it is,” Torvalds added. 

AI Is Nothing New but Hype!

Previously, when Torvalds was asked if he would accept AI-written code in the Linux kernel, he countered that AI is nothing new, as automation has helped programmers write code for a long time, even before the term AI became popular. He stated that AI has helped programmers produce bug-free code on a smaller scale. 

A report published by CodeSignal suggests the same, revealing that 81% of developers are already using AI-powered coding assistants like ChatGPT and GitHub Copilot. 

These tools are primarily used to boost productivity, generate boilerplate code, and assist in debugging, reflecting their role as enhancers rather than replacements for human developers. 

AI hype is not limited to tech but is affecting markets worldwide.

Wall Street is becoming more sceptical of the AI hype, particularly regarding its impact on the performance of stocks. This scepticism is growing as investors question the sustainability and profitability of AI investments.

In 2024, US-based AI startups have demonstrated significant financial traction, with numerous companies raising over $100 million in funding. Notably, Elon Musk’s xAI secured a remarkable $6 billion, while Figure AI raised $675 million, showcasing investor confidence in AI’s transformative potential.

In a recent interview, Thomas Siebel, CEO of C3.ai, said that many businesses with existing tech stacks, some dating back to the 1990s and 2000s, are simply rebranding themselves as AI companies.

To cash in on the hype, many products are being marketed with AI integration without clear benefits. A Reddit thread reflects frustration with the AI hype, suggesting that the AI movement might not live up to its promises, leading to scepticism similar to that around past tech hypes like blockchain. 

Linus Also Criticised Crypto in the Past (And He Was Right)

Torvalds has been quite critical of cryptocurrency and blockchain technology. He has described cryptocurrencies as “a great vehicle for scams” and likened them to Ponzi schemes that aim to find “the next sucker holding the bag”.

Torvalds has dismissed the idea that he could be the mysterious creator of Bitcoin, Satoshi Nakamoto, and has clarified that he does not own a significant amount of Bitcoin.

In addition to his criticism of cryptocurrencies, Torvalds has also expressed scepticism about the concept of technological singularity. He has referred to it as a “bedtime story for children” and a concept that makes for great sci-fi but does not hold up under practical scrutiny. 

He believes that continuous exponential growth in technology is unrealistic and that we are already seeing the limits of such growth.

AI Made NVIDIA Fall in Love With Linux

NVIDIA, which is not exactly famous for its interactions with the kernel community, has actually been much more active and involved in the development of the Linux memory-management code.

“Suddenly, they start caring about Linux when they are selling a lot of AI hardware, I mean it used to be crypto, and it’s still obviously GPUs, and it’s being used in big servers and running Linux, so it has actually had a positive impact,” said Torvalds, suggesting that the AI revolution made NVIDIA interested in Linux, which was not the case previously. 

Recently, NVIDIA open-sourced its GPU kernel modules for Linux starting with the R515 driver release, which is quite unusual compared to its past record. This was a significant step towards improving the compatibility and performance of NVIDIA GPUs on Linux systems.

NVIDIA actively contributes to important open-source projects, including the Linux kernel, Docker, Kubernetes, and TensorFlow. These contributions help accelerate innovation and improve the integration of NVIDIA hardware with open-source software.

Do We Really Have To Wait 10 Years?

87% of organisations believe that AI and machine learning will help them grow revenue, improve operational efficiency, and enhance customer experiences.

OpenAI has achieved remarkable financial success, with annual recurring revenue (ARR) reaching $3.4 billion by May 2024, marking a 580% year-over-year increase from $2 billion at the end of 2023. This growth is driven primarily by the widespread adoption of ChatGPT and its enterprise API offerings. 

Meanwhile, Hugging Face reported $70 million in ARR at the end of 2023, reflecting a 367% increase from the previous year. The company has positioned itself as a central hub for open-source machine learning models, attracting major clients like NVIDIA, Amazon, and Microsoft.  

Additionally, TCS has doubled its generative AI pipeline to $1.5 billion, further indicating the expanding market and investment in AI technologies.

AI is poised to inject a substantial $15.7 trillion into the global economy by 2030, highlighting its transformative potential. This economic boost is driven by AI’s ability to enhance productivity and create new market opportunities.

The post Linux Creator Advocates Waiting for 10-Years Before Embracing AI appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-insights-analysis/linux-creator-advocates-waiting-for-10-years-before-embracing-ai/feed/ 0
Nagpur Police Uses Staqu’s AI-Powered Tool for Crime Detection https://analyticsindiamag.com/ai-news-updates/nagpur-police-uses-staqus-ai-powered-tool-for-crime-detection/ https://analyticsindiamag.com/ai-news-updates/nagpur-police-uses-staqus-ai-powered-tool-for-crime-detection/#respond Thu, 18 Jul 2024 11:00:15 +0000 https://analyticsindiamag.com/?p=10129511

SIMBA can identify individuals by analysing their voice patterns and facial features.

The post Nagpur Police Uses Staqu’s AI-Powered Tool for Crime Detection appeared first on AIM.

]]>

Gurugram-based startup Staqu Technologies has partnered with the Nagpur Police to introduce SIMBA (System Integrated for Monitoring and Big-data Analysis), an AI-powered tool designed to bolster law enforcement capabilities in Nagpur. 

SIMBA, integrated with Staqu’s proprietary JARVIS platform, leverages a digitised database of 150,000 criminals to provide real-time, customised information based on specific prompts. 

The tool utilises advanced features such as facial recognition and speaker identification to deliver swift and accurate data from various sources, including CCTV feeds, images, and audio. 

SIMBA can identify individuals by analysing their voice patterns and facial features. It also leverages Crime GPT, which was previously used by the UP police to catch criminals.

Crime GPT uses LVM and LLM-based AI models to analyse video, document, and audio data, facilitating faster retrieval of criminal information and aiding ongoing investigations.

Dr Ravinder Singal, Commissioner of Police, Nagpur, remarked on the benefits of SIMBA, noting, “This AI-powered tool will not only enhance public security but also improve our efficiency in maintaining law and order by providing real-time alerts and streamlining investigative processes.” 

Staqu Technologies, founded in 2015 and based in Gurugram, provided its advanced AI technology for security at the G20 Summit held in Delhi in 2023. Staqu also made headlines during COVID-19 by introducing JARVIS-powered thermal camera solutions and a contactless attendance system.

The post Nagpur Police Uses Staqu’s AI-Powered Tool for Crime Detection appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/nagpur-police-uses-staqus-ai-powered-tool-for-crime-detection/feed/ 0
OpenText Unveils AI-Powered Fortify Aviator to Revolutionise Code Security https://analyticsindiamag.com/ai-news-updates/opentext-unveils-ai-powered-fortify-aviator-to-revolutionize-code-security/ https://analyticsindiamag.com/ai-news-updates/opentext-unveils-ai-powered-fortify-aviator-to-revolutionize-code-security/#respond Thu, 18 Jul 2024 08:37:20 +0000 https://analyticsindiamag.com/?p=10129476

Fortify Aviator aims to enhance the accuracy of vulnerability detection and provide developers with actionable insights.

The post OpenText Unveils AI-Powered Fortify Aviator to Revolutionise Code Security appeared first on AIM.

]]>

OpenText, a Canadian information management company, recently introduced Fortify Aviator, an AI-powered code security solution designed to streamline the process of identifying and fixing code vulnerabilities. 

This new tool aims to reduce the time developers spend on static application security testing (SAST) by integrating advanced AI capabilities directly into their workflows. Fortify Aviator uses LLMs and OpenText’s extensive experience in the SAST market to offer a combination of deep, accurate scans and rapid remediation. 

The tool is designed to improve security accuracy by more precisely identifying true vulnerabilities and explaining false positives, thus reducing the time developers spend on manual validation. Additionally, it provides fully contextualised remediation suggestions, helping developers quickly fix code issues without extensive research and context-switching.

What is Code Security, and Why is it Important?

Code security refers to embedding security measures directly into the code during the development process. This practice is crucial for protecting intellectual property and preventing tampering or theft of code. 
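
For illustration, the classic class of flaw a SAST tool flags looks like the minimal Python example below (the schema is invented): the unsafe version splices user input into a SQL string, while the fix parameterises the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

def find_user_unsafe(name):
    # Vulnerable: attacker-controlled input is spliced into the SQL string.
    return conn.execute(f"SELECT * FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(name):
    # Fixed: a parameterised query lets the driver handle escaping.
    return conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()
```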

By leveraging AI, Fortify Aviator aims to enhance the accuracy of vulnerability detection and provide developers with actionable insights, thus reducing the noise of false positives and non-critical issues. 

The introduction of Fortify Aviator comes at a time when AI is increasingly being integrated into DevSecOps practices. AI technologies are being leveraged to enhance various aspects of DevSecOps, including threat intelligence, vulnerability management, automated testing, behaviour analysis, and incident response. 

Unlike competitors such as Checkmarx and Veracode, Fortify Aviator differentiates itself through advanced AI that provides contextualised fix suggestions directly within the developer’s workflow, reducing the time spent on manual validation and remediation.

Despite the benefits, the adoption of AI in code security is not without challenges. According to a recent report, 80% of data experts believe that AI exacerbates data security challenges, with concerns around the inadvertent exposure of sensitive data by AI models and adversarial attacks by malicious actors. Additionally, 57% of respondents have seen a significant increase in AI-powered attacks in the past year.

The post OpenText Unveils AI-Powered Fortify Aviator to Revolutionise Code Security appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/opentext-unveils-ai-powered-fortify-aviator-to-revolutionize-code-security/feed/ 0
Knowledge Graphs are Making LLMs Less Dumb https://analyticsindiamag.com/ai-insights-analysis/knowledge-graphs-are-making-llms-less-dumb/ https://analyticsindiamag.com/ai-insights-analysis/knowledge-graphs-are-making-llms-less-dumb/#respond Wed, 17 Jul 2024 13:02:31 +0000 https://analyticsindiamag.com/?p=10129437

Knowledge graphs help reducing AI hallucinations, provides up-to-date information, and leverages the relationships between data points to enhance the quality of AI-generated content.

The post Knowledge Graphs are Making LLMs Less Dumb appeared first on AIM.

]]>

In the rapidly evolving landscape of artificial intelligence, a powerful synergy is emerging between knowledge graphs and generative AI (GenAI). This partnership is revolutionising how data is processed, interpreted, and utilised, enabling more accurate, context-aware, and personalised AI applications.

Jim Webber, chief scientist at Neo4j, explains, “By training an LLM on a knowledge graph’s curated, high-quality, structured data, we can address the gamut of challenges associated with generative AI. This combination offers high levels of accuracy and correctness, making it an ideal partner for LLMs.”

Why Knowledge Graphs are Important 

Tony Seale, knowledge graph engineer at UBS, emphasised, “Knowledge graphs are key to unlocking the power of AI. They enable effective AI deployments by providing a structured framework that mirrors human understanding and reasoning.”

By integrating structured data, knowledge graphs enhance the accuracy and relevance of AI outputs. They provide context and connectivity, allowing AI systems to generate more precise and context-aware responses.

A study published in Nature highlights, “The union of causal graphs’ systematic approach with AI-driven creativity paves the way for the future of psychological inquiry. This integration enhances the transparency and interpretability of AI outputs”.

Transparency is crucial for the acceptance and adoption of AI technologies. Knowledge graphs provide clear insights into the processes behind AI decisions, which is essential to build trust.

Dr Peter Haase, the founder & chief scientific officer at metaphacts, noted, “A unique component of the Dimensions Knowledge Graph is the symbolic AI layer it provides, introducing an enhanced level of transparency and trustworthiness to AI applications.”

Furthermore, knowledge graphs have become invaluable for businesses, helping uncover hidden patterns and insights from vast amounts of data. They support better decision-making by providing a comprehensive view of interconnected data. 

David Newman of Wells Fargo explained, “Knowledge graph technology has emerged as a viable production-ready capability to elevate the state of the art of data management, supporting customer 360, risk management, regulatory compliance, and more”. 

Knowledge Graphs & Big Tech 

Developers combine OpenAI’s large language models (LLMs) with graph databases like Neo4j to perform retrieval-augmented generation (RAG). This approach fetches relevant information from a graph database, which is then used to generate responses. 

This method helps reduce hallucinations, provides up-to-date information, and leverages the relationships between data points to enhance the quality of AI-generated content.
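
A bare-bones version of this graph-backed RAG pattern might look like the following sketch; the connection details, graph schema, and example entity are assumptions for illustration.

```python
from neo4j import GraphDatabase  # pip install neo4j

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def fetch_facts(entity: str) -> list[str]:
    # Pull the entity's immediate relationships out of the graph.
    query = (
        "MATCH (e {name: $name})-[r]->(n) "
        "RETURN e.name AS s, type(r) AS p, n.name AS o LIMIT 25"
    )
    with driver.session() as session:
        return [f"{rec['s']} {rec['p']} {rec['o']}" for rec in session.run(query, name=entity)]

# Ground the LLM in curated facts instead of letting it free-associate.
facts = fetch_facts("Ada Lovelace")
prompt = "Answer using only these facts:\n" + "\n".join(facts) + "\n\nQuestion: ..."
```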

By integrating knowledge graphs with Azure OpenAI, Microsoft enhances natural language processing (NLP) capabilities, enabling better entity recognition and relationship identification across large datasets.

Amazon, too, makes extensive use of knowledge graphs to improve its recommendation systems and manage content across its platforms. For instance, Amazon Neptune ML uses graph neural networks (GNNs) to make predictions on graph data, improving the accuracy of predictions by over 50% compared to non-graph ML techniques.

As described by Amazon, “To help Amazon’s recommendation engine make these types of commonsense inferences, we’re building a knowledge graph that encodes relationships between products in the Amazon Store and the human contexts in which they play a role — their functions, their audiences, the locations in which they’re used, and the like.” 

Knowledge graphs are crucial to AI and a core part of the internet. Google, for instance, has been using knowledge graphs to enhance its search since 2012, making them one of the most important technologies for the internet and future AI models.

The post Knowledge Graphs are Making LLMs Less Dumb appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-insights-analysis/knowledge-graphs-are-making-llms-less-dumb/feed/ 0
Huawei Completes $1.4 Billion Chip R&D Centre in Shanghai https://analyticsindiamag.com/ai-news-updates/huawei-completes-1-4-billion-chip-rd-centre-in-shanghai/ https://analyticsindiamag.com/ai-news-updates/huawei-completes-1-4-billion-chip-rd-centre-in-shanghai/#respond Wed, 17 Jul 2024 12:32:58 +0000 https://analyticsindiamag.com/?p=10129432

Despite U.S. trade restrictions, Huawei has developed Kirin 9000s chip, featuring 7-nanometer processing technology.

The post Huawei Completes $1.4 Billion Chip R&D Centre in Shanghai appeared first on AIM.

]]>

Huawei Technologies has recently completed the construction of a $1.4 billion research and development (R&D) centre in Shanghai, a significant move in the ongoing technological rivalry between China and the United States. 

This new facility, named the Huawei Lianqiu Lake R&D Centre, is located in the Qingpu district of west Shanghai and spans 1.6 million square metres. It is designed to advance Huawei’s capabilities in critical areas such as 5G, cloud computing, and AI.

The centre is also expected to house around 30,000 employees focused on chip technology, wireless networks, and the Internet of Things (IoT).

Huawei is facing difficulties in scaling up the production of its Ascend 910B AI chip due to reliance on refurbished chip fabrication machinery, highlighting the complexities of the semiconductor industry under geopolitical pressures.

The Chinese tech giant has been under stringent U.S. trade restrictions since 2019, which have severely limited its access to advanced semiconductor technologies. 

These sanctions were further tightened in May 2024, with the revocation of special licenses for U.S. chip manufacturers like Intel and Qualcomm to supply Huawei. Despite these challenges, Huawei has made significant progress, notably with the development of its Kirin 9000s chip, which features 7-nanometer processing technology.

The post Huawei Completes $1.4 Billion Chip R&D Centre in Shanghai appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/huawei-completes-1-4-billion-chip-rd-centre-in-shanghai/feed/ 0
MachineHack is What LeetCode Should be for AI and ML https://analyticsindiamag.com/ai-origins-evolution/machinehack-is-what-leetcode-should-be-for-ai-and-ml/ https://analyticsindiamag.com/ai-origins-evolution/machinehack-is-what-leetcode-should-be-for-ai-and-ml/#respond Tue, 16 Jul 2024 11:57:36 +0000 https://analyticsindiamag.com/?p=10129285

MachineHack emerges as the go-to platform for AI/ML professionals, offering hackathons, and assessments to bridge the gap LeetCode can't fill.

The post MachineHack is What LeetCode Should be for AI and ML appeared first on AIM.

]]>

A notable change is taking place in the ever-evolving landscape of tech hiring. LeetCode, once a cornerstone of technical interview preparation, is losing its charm. Gaurav Sen, the CEO of InterviewReady, a platform focused on system design interviews, explains the flaw, “LeetCode is designed for interview preparation, not upskilling.” 

As a result, prominent firms such as Accenture, Airtable, Betterment, and GitLab are transitioning away from LeetCode-style interviews to more practical, real-world assessments.

The lack of AI and ML-related practice questions is another reason why LeetCode finds itself irrelevant for AI and ML engineers. This is where MachineHack steps in. The platform empowers AI developers through hackathons, community learning, and comprehensive assessments, and aims to bridge the gap between LeetCode and ML. 

MachineHack Generative AI is a complete package where developers can showcase their work, collaborate with other AI/ML developers, and participate in hackathons. Unlike LeetCode, which is mainly limited to data structures and algorithms (DSA), MachineHack covers DSA, ML, and all the other essentials for ML jobs.

“We’ve designed MachineHack to be the ultimate package for AI/ML enthusiasts and professionals. Our platform provides a holistic approach to career growth, offering everything from skill development to networking. It caters to the evolving needs of developers and employers in this dynamic field,” said Saket Narayane, product lead (AI) at MachineHack Generative AI. 

A Hub for Dynamic Hackathons

Unlike LeetCode, MachineHack is renowned for its hackathons, which challenge developers to solve real-world problems. 

So far, the platform has hosted over 200 AI hackathons, including the immensely popular Data Science Student Championship and the Bhasha Techathon (in collaboration with Google and the Government of India). MachineHack provides the perfect platform for developers to apply their skills in practical scenarios. 

“As a first-time user of MachineHack, I believe it’s a fantastic platform to kickstart my data science journey. It’s user-friendly and well-organised, making it easy to access data, submit solutions, see the leaderboard, etc,” said Shubham Aggarwal, a student from IIT Madras. He added that overall, MachineHack provided a very positive and engaging experience that “solidified my interest in data science. I’m excited to keep learning and participating in future challenges!” 

These events are backed by industry leaders who are keen to discover and nurture new talent. For instance, the Great Indian Hiring Hackathon and MATHCO.THON: The Data Scientist Hiring Hackathon by TheMathCompany are notable examples where participants can directly interact with potential employers. 

One of its standout events is the Ideathon: How to Detect AI-Generated Content, which invites participants to build tools that can flag AI-generated content in text, images, videos, and audio. This ideathon is research-oriented and attracts esteemed research scholars, providing a platform for deep, exploratory work in AI/ML.

MachineHack, Connecting Developers

Unlike popular coding platforms like LeetCode and Kaggle, MachineHack is designed to be a vibrant community where generative AI professionals can share insights, learn from each other, and grow together.

Akash Kundu, the winner of the Video Frames Sorting Challenge, said that top performers helped him stay motivated to compete with them. “Grandmaster of the hackathons helped and guided me to be a top-ranker while I was in the learning phase,” he added. 

Narayane told AIM that MachineHack connects users with mentors and peers, facilitating a supportive network that can provide guidance, feedback, and career advice. The mentorship aspect is crucial for professional growth and skill enhancement in this rapidly evolving field of AI.

Beginner Friendly with Enough Room for Advanced Users

Often, platforms like Kaggle are saturated with difficult questions, and beginners can get overwhelmed by their complexity. 

Narayane noted that MachineHack is one of the best platforms to get started with AI and ML. “It offers a variety of AI and ML courses. Plus, the participants can then attend hackathons relevant to the AI course, which would give them a confidence boost,” he added. 

Himanshu Agnihotri, a BTech student in AI, agrees that the platform helps tackle real-world challenges. “The platform’s user-friendly setup and rich resources made it easy to dive into problem-solving. Engaging with different datasets and tackling real-world challenges was both exciting and fulfilling,” he added. 

Bridging the Gap between LeetCode and AI

Besides MachineHack, other platforms like deep-ml also provide the UI of LeetCode for AI. “But these are relatively new projects that lack hackathons, courses, and a platform to connect with other developers,” concluded Narayane. 

The post MachineHack is What LeetCode Should be for AI and ML appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-origins-evolution/machinehack-is-what-leetcode-should-be-for-ai-and-ml/feed/ 0
Top 10 Uncensored LLMs You Can Run on a Laptop https://analyticsindiamag.com/ai-insights-analysis/top-10-uncensored-llms-you-can-run-on-a-laptop/ https://analyticsindiamag.com/ai-insights-analysis/top-10-uncensored-llms-you-can-run-on-a-laptop/#respond Sun, 14 Jul 2024 06:30:00 +0000 https://analyticsindiamag.com/?p=10126792

Uncensored LLMs offer unrestricted creativity, enhanced learning, and deeper insights while ensuring privacy.

The post Top 10 Uncensored LLMs You Can Run on a Laptop appeared first on AIM.

]]>

Running LLMs locally is the easiest way to protect your privacy, but traditional LLMs are restricted from answering certain types of questions to reduce abuse. You can even run LLMs on phones. 

However, there are times when one wants to explore uncharted territory. 

In that case, you would need uncensored LLMs that you can run locally on your laptop or PC. Here are the top 10 uncensored LLMs that you can run on your computer. 

1. Llama 2 Uncensored

Llama 2 Uncensored is based on Meta’s Llama 2 model and was created by George Sung and Jarrad Hope using the process defined by Eric Hartford. This model is designed to provide responses without alignment or moralising filters, making it suitable for a variety of applications where unfiltered output is required. 

It supports multiple quantisation options and is highly versatile for various applications, including role-playing. It has 234.9K pulls on Ollama.

2. WizardLM Uncensored

WizardLM Uncensored is a 13B-parameter model based on Llama 2 Uncensored. It was trained to remove responses containing alignment or moralising, making it a versatile model for various applications. It supports multiple quantisation levels, allowing users to balance performance against memory usage, and has 23.1K pulls on Ollama.

3. Llama 3 8B Lexi Uncensored

Lexi Uncensored is based on Llama-3-8b-Instruct and is governed by Meta’s Llama 3 Community Licence Agreement. Lexi is designed to be highly compliant with any request, even unethical ones, so it should be used responsibly. 

It is suitable for general-purpose tasks and requires careful use due to its lack of ethical constraints. It has 11,000+ downloads on HuggingFace.

4. Llama3 8B DarkIdol 2.1 Uncensored

Llama3 8B DarkIdol is adapted for various roles, including mobile phone applications. It specialises in role-playing scenarios, offers quick responses, and has 12,000+ downloads on HuggingFace. 

The model is a combination of multiple merged models, enhancing its performance.

5. Wizard Vicuna 7B Uncensored

Wizard Vicuna 7B Uncensored is a model trained against LLaMA-7B with a subset of the dataset. It provides multiple quantisation parameter options for different hardware requirements, making it suitable for both CPU and GPU inference.

Wizard Vicuna is considered a top choice for hosting an LLM in the cloud.

6. Dolphin Mistral

Dolphin Mistral is based on the Mistral V0.2 base model and fine-tuned with the Dolphin 2.9 dataset. It is completely uncensored and offers a 32k context window, making it suitable for complex chat and role-playing scenarios.

7. SOLAR 10 7B Instruct v1.0 Uncensored

Solar 10 7B Instruct is a model designed for instruction-based tasks. It supports the GGUF format, which offers better tokenisation and support for special tokens. The model is known for its high-quality instruction following and multiple quantisation options.

8. Guanaco 7B Uncensored

Guanaco 7B Uncensored is fine-tuned on the Unfiltered Guanaco Dataset using Llama-2-7b as the base model. It offers various quantisation methods for different hardware setups, making it suitable for both CPU and GPU inference.

9. Uncensored Frank 13B

Uncensored Frank is a 13B model inspired by the character Frank Costello from “The Departed.” It is designed to offer unfiltered discussions on a wide array of topics. The model supports multiple quantisation options and is suitable for both CPU and GPU inference.

10. Uncensored Jordan 7B

Uncensored Jordan is a 7B parameter model designed for general-purpose tasks without censorship. It is suitable for users who need an uncensored model for various applications and supports multiple quantisation options.

These models offer a range of capabilities and are suitable for various applications, from general-purpose tasks to specialised role-playing scenarios. Ensure you have the necessary hardware requirements before running these models on your laptop.
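
As a quick way to try any of the Ollama-hosted models above: after installing Ollama and pulling a model (for example, `llama2-uncensored`), its local REST API can be called from Python, as in this minimal sketch.

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",  # Ollama's default local endpoint
    json={"model": "llama2-uncensored", "prompt": "Hello!", "stream": False},
)
print(resp.json()["response"])
```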

The post Top 10 Uncensored LLMs You Can Run on a Laptop appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-insights-analysis/top-10-uncensored-llms-you-can-run-on-a-laptop/feed/ 0
LeetCode for Machine Learning: A Game Changer in ML Education https://analyticsindiamag.com/ai-news-updates/leetcode-for-machine-learning-a-game-changer-in-ml-education/ https://analyticsindiamag.com/ai-news-updates/leetcode-for-machine-learning-a-game-changer-in-ml-education/#respond Fri, 12 Jul 2024 07:45:40 +0000 https://analyticsindiamag.com/?p=10126641

Deep-ml.com aims to provide a LeetCode-like experience specifically tailored for machine learning enthusiasts and professionals.

While LeetCode offers a plethora of coding challenges focused on algorithmic and data structure problems, it lacks questions for machine learning (ML). This gap is now being addressed by a new platform, deep-ml.com, which aims to provide a LeetCode-like experience specifically tailored for machine learning enthusiasts and professionals.

The platform, initially launched as DeepMLeet on Streamlit, has since been upgraded to a more robust site that lets users create accounts and track their progress.

Bridging LeetCode’s Gap

This new site not only looks more polished but also offers a variety of questions spanning key areas in ML, including linear algebra, machine learning algorithms and deep learning. The platform’s creator emphasised the importance of being able to program machine learning algorithms from scratch, a critical skill for any ML professional; a minimal example of such an exercise is sketched below.
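To make concrete the kind of from-scratch exercise such a platform poses, here is a minimal solution in plain NumPy to a hypothetical problem in the site's style (not one taken from it): fitting linear regression weights with the normal equation.

```python
# From-scratch exercise in the style deep-ml.com describes (hypothetical
# problem): solve least squares via the normal equation, w = (X^T X)^-1 X^T y,
# using only NumPy.
import numpy as np

def linear_regression_normal_equation(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Return least-squares weights [bias, w1, ...] for X of shape (n, d)."""
    X = np.column_stack([np.ones(len(X)), X])   # prepend a bias column
    # solve() is more numerically stable than explicitly inverting X^T X
    return np.linalg.solve(X.T @ X, X.T @ y)

# Tiny check: recover y = 1 + 2x from noiseless data.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
print(linear_regression_normal_equation(X, y))  # approximately [1.0, 2.0]
```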

By focusing on these areas, the platform provides a comprehensive learning and practice environment for those looking to excel in machine learning. Users can solve problems, save their progress, and revisit challenges as needed, much like LeetCode.

This feature is particularly beneficial for users who want to systematically improve their skills over time.

Similarly, other platforms look to grow the developer community through hackathons and competitions. MachineHack, a leading GenAI startup, is renowned for empowering AI professionals through community sessions and interactive hackathons. The company is also hosting a hackathon challenge with Abu Dhabi’s Technology Innovation Institute, famously known as the maker of Falcon LLM.

The post LeetCode for Machine Learning: A Game Changer in ML Education appeared first on AIM.

Soon, Running LLMs Locally will be a Pro Feature for Flagship Smartphones https://analyticsindiamag.com/ai-origins-evolution/soon-running-llms-locally-will-be-a-pro-feature-for-flagship-smartphones/ https://analyticsindiamag.com/ai-origins-evolution/soon-running-llms-locally-will-be-a-pro-feature-for-flagship-smartphones/#respond Thu, 11 Jul 2024 12:58:21 +0000 https://analyticsindiamag.com/?p=10126578

A recent report published by Canalys suggests that by the end of 2024, 16% of new smartphones shipped will be generative AI-capable, with this figure expected to rise to 54% by 2028.

Flagship smartphones have long been known for their outstanding cameras, displays and battery life. But now, a groundbreaking feature defines the latest flagship models—the ability to run large language models (LLMs) locally on the device. 

One good example of this pro feature is the recent announcement of iOS 18, where Apple unveiled Apple Intelligence for devices running the A17 Pro, its flagship SoC. Similarly, Samsung announced Galaxy AI for its flagship smartphones last year.

A recent report published by Canalys suggests that by the end of 2024, 16% of new smartphones shipped will be generative AI-capable, with this figure expected to rise to 54% by 2028. 

Jensen Huang, the CEO of NVIDIA, has suggested that with on-device generation, you won’t need to search online for information as much, since it will be generated locally on your phone, saving a lot of energy and time.

Hardware Advancements are on the Way

It’s true that running LLMs locally requires capable hardware. To address this problem, developers came up with small language models (SLMs), stripped-down versions of LLMs that require far fewer resources.

But even after stripping down the parameter count, you still need flagship hardware to run these models locally. For example, the MLC Chat app, one of the easiest ways to run an LLM on a smartphone, is only available for the Samsung Galaxy S23 with its Snapdragon 8 Gen 2 chip, a flagship device from Samsung.

One user ran Mixtral 8x7B at 11 tokens per second on a mobile phone through PowerInfer-2, but only on a OnePlus device with 24GB of RAM.

This simply means you need capable hardware to run LLMs locally on your smartphone. For now, it remains a flagship feature, limited to top-end products until more midrange SoCs like the Qualcomm Snapdragon 7+ Gen 3 become widely available.
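A back-of-the-envelope estimate makes the requirement concrete: a model’s weights alone need roughly parameters × bits-per-weight ÷ 8 bytes of memory, before counting activations and the KV cache. The sketch below uses illustrative figures, not vendor specifications.

```python
# Rough RAM estimate for holding an LLM's weights on-device. Real usage is
# higher (KV cache, activations, runtime overhead), so treat these numbers
# as illustrative lower bounds.

def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in gigabytes."""
    return params_billions * bits_per_weight / 8

for name, params in [("Llama 3 8B", 8), ("Mixtral 8x7B", 47)]:
    for bits in (16, 4):
        print(f"{name} at {bits}-bit: ~{weight_memory_gb(params, bits):.1f} GB")

# Llama 3 8B at 4-bit   -> ~4.0 GB: feasible on a flagship phone
# Mixtral 8x7B at 4-bit -> ~23.5 GB: hence the 24GB OnePlus above
```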

The Snapdragon 7+ Gen 3 is a midrange SoC with a Qualcomm Hexagon NPU that delivers improved AI performance and is advertised to run GenAI workloads locally with lower latency than its predecessors.

On the other hand, flagship SoCs are getting more powerful. In a recent demonstration, the MediaTek Dimensity 9300, a flagship offering from MediaTek, executed text-to-video generation, real-time GenAI animation and a lot more.

Furthermore, the Dimensity 9300 will feature a software stack optimised to run Meta’s Llama 2. This is possible because the APU 790 AI processor integrated into the chip significantly improves generative AI performance and energy efficiency for faster, more secure edge computing.

To power the next generation of AI-powered smartphones, Arm recently unveiled the Cortex-X925 and Cortex-A725, designed for sustained performance and allowing smartphones to handle a wide range of AI tasks efficiently. By enabling more powerful and efficient on-device AI processing, these CPUs reduce the need for cloud-based AI computation.

Why is Running LLMs Locally Important?

Privacy. 

Cloud-based LLMs require data to be sent to external servers for processing, which increases the risk of data breaches and unauthorised access. According to a report by HiddenLayer, 77% of businesses experienced breaches in their AI systems over the past year. 

Moreover, breaches via third-party vendors increase data exposure risks by over 15% on average. This is particularly concerning given that AI models often handle vast amounts of sensitive information, such as personal identifiers, financial data and proprietary business information.

All of this can be prevented by running LLMs locally, as the data never leaves your device. Performance will, of course, vary with hardware, and while there are ways to cluster multiple devices to run larger models, that approach is hardly practical for a general audience.

The post Soon, Running LLMs Locally will be a Pro Feature for Flagship Smartphones appeared first on AIM.

When will the Linux Moment in AI Arrive? https://analyticsindiamag.com/ai-insights-analysis/when-will-the-linux-moment-in-ai-arrive/ https://analyticsindiamag.com/ai-insights-analysis/when-will-the-linux-moment-in-ai-arrive/#respond Thu, 11 Jul 2024 06:17:44 +0000 https://analyticsindiamag.com/?p=10126492

By addressing hardware and software compatibility and ease of usage, AI-driven operating systems can achieve mainstream adoption and transform the way Linux has done over the past decades.

Over the years, Linux has emerged as the world’s most widely deployed operating system, thanks to its open-source nature. This has enabled a global community of developers to contribute to its development, enhancing its stability, security and flexibility across applications ranging from servers to embedded systems.

Just as Linux transformed the landscape of traditional operating systems, the advent of AI is poised to revolutionise how operating systems function. 

A recently published research paper, ‘AI-Based OS: Future of Operating System’, explained how AI can be integrated into operating systems to enhance their capabilities beyond traditional software and hardware management.

This shift towards AI-driven operating systems mirrors the early days of Linux, where a new paradigm in computing began to take shape.

Recently, Andrej Karpathy suggested that an AI kernel could replace current operating systems.

The Current Progress

To have a Linux moment in AI, we need an open ecosystem of software and hardware support for AI operating systems. This was the key reason why Linux became a major hit. 

Just as there are different Linux distributions, there are different AI operating systems catering to different problems.

The learning-directed operating system (LDOS) aims to revolutionise traditional operating systems by integrating advanced machine learning techniques. LDOS automates the labour-intensive tasks involved in OS implementation and tuning, making systems more “self-driving”.

LDOS enables the development and deployment of innovative real-time applications with complex resource needs. For instance, autonomous service robots can run third-party apps to extend their functionality, and real-time 6G mobile access edges can support advanced applications in smart cities and factories.

Similarly, BlackSwan Technologies has unveiled what it claims to be the world’s first enterprise AI operating system, ELEMENT, a low-code/no-code, cloud-agnostic platform.

This system allows users to build enterprise applications up to 60 times faster and at a fraction of the cost compared to traditional methods.

ELEMENT is designed to continuously learn and evolve with the enterprise, offering a customisable structure with a drag-and-drop interface.

Apart from revolutionising industries, AI-based operating systems are also starting to compete with proprietary solutions like Google Home and Amazon Echo, led by OpenVoiceOS.

As the name suggests, OpenVoiceOS is an open-source voice AI platform project designed with a strong focus on privacy. It allows users to control their data and operate the platform fully offline, if desired.

Since Linux is technically a kernel rather than a complete operating system, the closest parallel is LLMOS, which aims to use an LLM as the kernel for tasks like process scheduling, memory management and user interaction.

Karpathy has also said, “A more complete picture is emerging of LLMs not as a chatbot, but the Kernel process of a new operating system.”

In LLMOS, AI agents function as applications, performing specific tasks and interacting with the LLM kernel to provide services to the user.

The best part is that LLMOS is not tied to a specific LLM; you can choose from a variety of models to craft an operating system to your liking.

On LLMOS, user prompts and instructions serve as the user interface, allowing natural language interaction between the user and the system and making the OS more intuitive and accessible.

LLMOS is also open source, which means any software developer can take the code and build their own solution around it, much as happened with Linux on its way to becoming the world’s most popular operating system.
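As a toy illustration of this agents-as-applications idea (all names here are hypothetical; LLMOS as discussed is a design concept, not a published API), a minimal “kernel” might route natural-language requests to registered agents, with the routing model kept swappable:

```python
# Toy sketch of the LLM-as-kernel idea: agent "apps" register with a kernel
# that routes natural-language requests to them. The router is pluggable,
# mirroring the claim that no single LLM is required. Hypothetical names
# throughout; this is not a real LLMOS API.
from typing import Callable, Dict, List

class LLMKernel:
    def __init__(self, route: Callable[[str, List[str]], str]):
        self.route = route                    # pluggable model that picks an agent
        self.agents: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, agent: Callable[[str], str]) -> None:
        self.agents[name] = agent             # analogous to installing an app

    def handle(self, prompt: str) -> str:
        chosen = self.route(prompt, list(self.agents))
        return self.agents[chosen](prompt)

# Stand-in router; a real system would ask an LLM to choose the agent.
def keyword_router(prompt: str, agent_names: List[str]) -> str:
    return "scheduler" if "remind" in prompt.lower() else "search"

kernel = LLMKernel(route=keyword_router)
kernel.register("scheduler", lambda p: f"Reminder set: {p}")
kernel.register("search", lambda p: f"Searching locally for: {p}")
print(kernel.handle("Remind me to review the LDOS paper tomorrow"))
```

Swapping `keyword_router` for a call to any LLM is the design point: the kernel interface stays fixed while the underlying model changes, echoing how the Linux kernel runs unchanged across distributions.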

Linux and AI OS Face Similar Problems

Linux is known to work on almost every computer but struggles with peripheral compatibility. The same goes for an AI OS: to run AI as the kernel, you need strong hardware with a decent GPU.

Furthermore, AI-based operating systems have yet to become mainstream, which means it is still unclear whether hardware companies will ship hardware optimised for them.

Apart from compatibility issues, Linux is known for its limited desktop software catalogue, one of the major reasons it holds only about a 4% desktop market share. Similarly, we do not yet have a clear picture of what software and apps will be available for an AI OS, or how the OS will interact with them.

Another problem is complexity. Both Linux and an AI OS are highly technical pieces of software, and an ordinary user may not be able to configure them from scratch. So how manufacturers and software developers assemble the whole package matters most.

This means we need strong collaboration between hardware manufacturers and software developers, and the resulting package must be user-friendly overall. That is how we can achieve a Linux moment in AI.

The post When will the Linux Moment in AI Arrive? appeared first on AIM.

SAWiT.AI Announces Initiative to Train 500,000 Women in Gen AI Skills https://analyticsindiamag.com/ai-news-updates/sawit-ai-announces-initiative-to-train-500000-women-in-gen-ai-skills/ https://analyticsindiamag.com/ai-news-updates/sawit-ai-announces-initiative-to-train-500000-women-in-gen-ai-skills/#respond Wed, 10 Jul 2024 12:55:20 +0000 https://analyticsindiamag.com/?p=10126458

Studies by the McKinsey Global Institute estimate that increasing female workforce participation by just 10% could add a staggering $550 billion to India's GDP.

SAWiT.AI, a collaborative effort between South Asia Women in Tech (SAWiT) and GUVI, has recently announced an initiative to train 500,000 Indian women in AI skills.

Billed as the world’s largest women-only generative AI learning challenge, the initiative aims to bridge the gender gap in the workforce and position India as a global leader in AI innovation.

India’s female workforce participation rate was a mere 24% in 2022, significantly lower than China’s 61%. This low participation rate is a critical issue as over half of urban women are homemakers. 

Studies by the McKinsey Global Institute estimate that increasing female workforce participation by just 10% could add a staggering $550 billion to India’s GDP.

The initiative will provide hands-on experience with leading AI tools, supported by a distinguished advisory council that includes Roshni Nadar Malhotra, Chairperson of HCL Tech, Samantha Ruth Prabhu, an acclaimed actor and women’s empowerment advocate, and Farzana Haque, a Senior Leader at Tata Consultancy Services.

The SAWiT.AI initiative comprises three main events:

  • SAWiT.AI Learnathon (September 21, 2024): A hands-on learning experience in Generative AI.
  • SAWiT.AI Hackathon (October 2024): The world’s largest women-led Generative AI challenge, where teams develop advanced AI applications.
  • SAWiT.AI Festival (November 2024): Celebrating Generative AI innovation, awarding challenge winners, and recognising pioneering institutions, partners, and sponsors.

Women from both tech and non-tech backgrounds are encouraged to apply. Interested candidates can register at the SAWiT.AI registration page for a nominal fee of INR 99, with the deadline for registration set for September 18, 2024.

The post SAWiT.AI Announces Initiative to Train 500,000 Women in Gen AI Skills appeared first on AIM.
