Designing dozens of icon variations is both essential and challenging for many game studios. In this tutorial, I’ll share my workflow for creating a hypercasual-style icon from scratch.
The full process
Before delving into the specifics of creating icons for the fictional game “Office Shooting”, here is a high-level overview of the process and the tools utilized:
- Data set creation from a 3D model of a stickman
- Model creation using Dreambooth
- Image generation with Stable Diffusion using the Automatic1111 GUI
- Photobashing the selected image in Photoshop
- Styling the image with different models in Automatic1111
Dataset
The most critical element in crafting an AI model on a particular subject or style is the data set, comprising images that the model learns from. For my data set, I used a 3D model of a stickman from 3Dexport to generate around 70 images of stickmen from various angles. Here’s an example:
The size and diversity of your data set greatly contribute to the quality of the model. For each image in the set, I wrote a caption in an accompanying text file. I used the made-up word “ohwx” as a token for my subject, and “stickman” as the class. These two words then went into every prompt used to generate stickman images once the model was ready.
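The captioning step can be scripted. The following is a minimal sketch, assuming the renders sit in a single folder and that each caption lives in a text file with the same base name as its image; the function name `write_captions` and the optional `extra` description are hypothetical, and only the token “ohwx” and the class “stickman” come from my actual setup.

```python
from pathlib import Path

TOKEN = "ohwx"            # made-up token for the subject (from the article)
CLASS_WORD = "stickman"   # class word (from the article)
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed image formats


def write_captions(dataset_dir, extra=""):
    """Write a same-named .txt caption next to every image; return the count."""
    caption = f"{TOKEN} {CLASS_WORD}"
    if extra:
        # Optional free-text description appended after the token and class.
        caption += f", {extra}"

    count = 0
    # sorted() materializes the listing before any caption file is created.
    for image in sorted(Path(dataset_dir).iterdir()):
        if image.suffix.lower() in IMAGE_SUFFIXES:
            image.with_suffix(".txt").write_text(caption + "\n", encoding="utf-8")
            count += 1
    return count
```

Calling `write_captions("dataset")` on a hypothetical `dataset` folder would leave one caption file beside each image, all reading “ohwx stickman”. Check which caption layout your training tool expects before relying on this one.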
Training
Training was conducted using Dreambooth, a technique for fine-tuning existing text-to-image models on a small set of images of a specific subject. I ran it through the Dreambooth extension for Automatic1111, and the training run on 70 images took roughly 25 minutes.
Generating images
With the fine-tuned model, I was able to generate an impressive array of stickmen in various colors and actions, such as wielding weapons, throwing spears, and generally acting in their worst behaviors :)
The prompts were simple, such as “ohwx white stickman holding a machine gun” or “ohwx red stickman throwing a spear”.
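Because every prompt follows the same pattern (token, color, class, action), a batch of them can be generated rather than typed by hand. This is only an illustrative sketch: the helper `build_prompts` is hypothetical, and the colors and actions are the ones quoted above.

```python
from itertools import product

TOKEN = "ohwx"            # subject token used during training
CLASS_WORD = "stickman"   # class word used during training


def build_prompts(colors, actions):
    """Combine every color with every action into a prompt string."""
    return [
        f"{TOKEN} {color} {CLASS_WORD} {action}"
        for color, action in product(colors, actions)
    ]


prompts = build_prompts(
    colors=["white", "red"],
    actions=["holding a machine gun", "throwing a spear"],
)

for prompt in prompts:
    print(prompt)
```

The resulting strings can be pasted into the Automatic1111 prompt box one at a time, which makes it easy to compare colors and poses side by side.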
From these, I selected an image to serve as the icon for my new fictitious game, Office Shooting.
Photobashing
Photobashing is a technique in which images are crudely combined, with the main goal of nailing the desired composition, which makes it approachable for non-designers like myself. To my selected stickman, I added a dagger, painted some blood splatters (which proved to be distracting), and incorporated a briefcase using Photoshop’s generative fill feature.
Here are the results of the photobashing:
Styling
Once I was satisfied with the composition, I used ControlNet, an essential extension for Automatic1111, to replicate the composition accurately but with a significantly enhanced style. The process was straightforward: I selected a few models known for their excellent results, such as Rev Animated and Deliberate (many more are available on Civitai.com).
Then I adjusted ControlNet’s settings until I achieved satisfactory outcomes.
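When trying several models and settings, it helps to list the combinations up front so each output can be traced back to what produced it. The sketch below is a hypothetical bookkeeping helper, not part of Automatic1111 or ControlNet: the weight values are made-up examples of ControlNet’s control weight setting, and only the model names come from this article.

```python
from itertools import product

CHECKPOINTS = ["Rev Animated", "Deliberate"]  # models named in the article
CONTROL_WEIGHTS = [0.6, 0.8, 1.0]             # hypothetical values to try


def build_runs(checkpoints, weights):
    """Return one record per (checkpoint, control weight) combination."""
    runs = []
    for checkpoint, weight in product(checkpoints, weights):
        # The label doubles as a file-name stem for the saved result.
        label = f"{checkpoint.lower().replace(' ', '-')}_w{weight:.1f}"
        runs.append(
            {"checkpoint": checkpoint, "control_weight": weight, "label": label}
        )
    return runs


runs = build_runs(CHECKPOINTS, CONTROL_WEIGHTS)
for run in runs:
    print(run["label"])
```

Saving each generated image under its label keeps the comparison manageable once the number of models and settings grows.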
Conclusion
Creating the data set took about an hour, training took another hour, and photobashing and styling took approximately 30 minutes, for a total of around 2.5 hours. Now that I have a readily available stickman model, future icons featuring stickmen will require considerably less time.
However, the time factor is only part of the story. This technology offers a wide range of stylistic options to choose from and experiment with. Gaming companies could leverage this to conduct extensive A/B testing with sometimes radically different rough ideas, gradually homing in on a handful of favored options for further refinement and detail work.
Dori Adar is the owner of Hands on Games, a consulting company that helps art teams make the leap into AI creation.