Opinion | This year, prepare for the magic wand of generative AI to transform our world

  • Generative AI is becoming active and multimodal, and is being embedded in PCs and smartphones, penetrating our daily lives in profound ways
  • Governments must pay attention to ethical, privacy and copyright issues to ensure the positive development of the technology

Customs inspector Hui Ka-hei explains how XiaoHui, a virtual ambassador introduced at Hong Kong’s airport and three control points, works to handle public enquiries, on December 29. Photo: Dickson Lee
Since the debut of ChatGPT at the end of 2022, generative artificial intelligence (AI) has gained popularity globally. Various industries are eagerly adopting generative AI for diverse applications, including the creation of text, images, videos, computer code and more.
The output of generative AI in these domains has proliferated, presenting a diverse array of creative work. According to Gartner, an information technology (IT) research and consulting company in the United States, use of generative AI will increase substantially by 2026. It projects that more than 80 per cent of businesses will have integrated generative AI application programming interfaces (APIs), models or applications into their operations by then, up from less than 5 per cent now.

IT experts believe that generative AI will play a transformative role this year, akin to a magic wand that changes the world. In light of these developments, what major trends in generative AI can we expect this year?

Last year, most generative AI tools, algorithms and large language models focused on a single media type, such as text, images, speech or video, with limited emphasis on processing multiple media simultaneously. But with the emergence of models like GPT-4, generative AI is shifting towards multi-modal expression, enabling the simultaneous handling of diverse forms of data.

This suggests that human-machine communication will one day effortlessly incorporate mixed media, as ChatGPT is starting to do. Users can input non-textual cues, such as images, making the conversation more convenient and natural.

Furthermore, multi-modal generative AI is poised to enhance the creative capabilities of artists and improve the efficiency of professionals in fields such as writing, journalism and music. When appropriately used, multi-modal generative AI can spark artistic inspiration, generate work drafts or even fully automate content creation.