South Korean company Upstage is advancing document processing by developing small language models, catering to specific industry needs and promising enhanced accuracy.

In a significant development in the field of artificial intelligence (AI), the South Korean company Upstage is making strides in enhancing document processing through the development of small language models (SLMs). Established initially to utilise optical character recognition (OCR) to aid large corporations in document management, Upstage has pivoted towards addressing the rising demand for advanced language models since the introduction of ChatGPT.

Speaking at AWS re:Invent in Las Vegas, where the company was exhibiting, Upstage’s co-founder and chief product officer, Lucy Park, discussed the company’s evolution in an interview. “Customers wanted a language model that was fit for their own use,” she said, highlighting the impetus behind the creation of SLMs. While Upstage had achieved an impressive 95% accuracy rate with its OCR technology, the company recognised that clients were seeking a solution that could deliver 100% accuracy, prompting the exploration of models better suited to document processing tasks.

Despite limited attention surrounding SLMs compared to larger counterparts, their distinctive capabilities enable corporations to develop tailored language models that cater specifically to their requirements, whether these be industry or locale-specific. Upstage’s focus on SLMs allows them to concentrate on specialised applications, rather than the broad applicability found in large language models (LLMs). This targeted approach has potential ramifications for business practices, particularly in sectors that rely heavily on precise document processing.

The company is leveraging open-source models within its AI architecture, allowing its models to run on a single graphics processing unit (GPU). Its flagship model, known as Solar, competes with notable models such as Meta’s Llama 3.1 8B and LG AI Research’s EXAONE 3.0 7.8B Instruct, available on Hugging Face. Park elaborated on the technique of model merging, an approach gaining traction in the AI sector. This method combines smaller models into a larger one, for instance merging a 7-billion-parameter model with a 10-billion-parameter model. “If we have a 14 billion model, we explode that into a 22 billion model,” she explained.
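One simple way to picture how a larger model can be built out of smaller checkpoints is depth up-scaling: stacking layers from two same-architecture checkpoints into a deeper model. The sketch below is a toy illustration of that idea, not Upstage’s actual method; the function name, layer counts, and string stand-ins for layer weights are all hypothetical.

```python
def depth_upscale(layers_a, layers_b, keep_a, keep_b):
    """Build a deeper model by concatenating the first `keep_a` layers of one
    checkpoint with the last `keep_b` layers of another. In a real system the
    entries would be layer weight tensors of matching shape; here they are
    placeholder strings so the growth in depth is easy to see."""
    return layers_a[:keep_a] + layers_b[-keep_b:]

# Two toy 8-layer checkpoints of the same architecture.
small_a = [f"A{i}" for i in range(8)]
small_b = [f"B{i}" for i in range(8)]

# Keep 6 layers from each: the merged model is deeper than either source.
big = depth_upscale(small_a, small_b, keep_a=6, keep_b=6)
print(len(big))  # 12 layers, versus 8 in each source model
```

The appeal of this family of techniques is that the merged model reuses already-trained weights rather than being trained from scratch, which is what keeps the computational cost low.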

Research has indicated that model merging permits the construction of capable general-purpose models without requiring access to the original training data or incurring significant computational costs. A collaborative paper from researchers at Nanyang Technological University, Northeastern University, and Sun Yat-sen University confirmed these advantages. Upstage has reported noteworthy gains in its benchmarks from this merging strategy, with its Solar Pro model achieving a 64% improvement on East Asian language benchmarks compared to earlier iterations.

The company’s commitment to small language models is underscored by the specificity they can achieve. For instance, Upstage has successfully developed a model tailored for the Thai language, using methodologies comparable to those behind OpenAI’s GPT-4. The financial aspect of SLMs also presents a compelling case for businesses: Park offered an illustrative comparison in which developing an SLM might cost $10 where a larger LLM would cost roughly $100.

Customers have three primary deployment options for these models: running them on-premise, accessing them via the Upstage console, or calling them through Application Programming Interfaces (APIs) on the AWS marketplace. Notably, the Solar Pro model has recently become available on the Amazon Bedrock Marketplace, further expanding its reach and accessibility for organisations seeking advanced document processing technologies.

Source: Noah Wire Services
