Remote Usability Testing Tools Compared: UserTesting, Maze, Lookback, Hotjar, FullStory

Remote usability testing tools split into two categories: research platforms (UserTesting, Maze, Lookback) for deliberate studies with recruited participants, and behavioral analytics tools (Hotjar, FullStory) that passively observe real user behavior at scale. Within the research platforms, Lookback specializes in live moderated sessions, while Maze and UserTesting are primarily unmoderated and task-based. The right tool depends on whether you're asking "why do users struggle?" (moderated), "how often do users struggle?" (analytics), or "can users complete this task?" (unmoderated). Most mature teams use at least two.

Key Takeaways

UserTesting gives you fast, broad-panel access at significant cost. Best when you need results quickly and don't have your own participants to recruit. Panel quality varies, so screeners matter enormously.

Maze is purpose-built for unmoderated task-based testing of designs. Strong Figma integration makes it the default choice for design validation before development. Less suited to live product testing.

Lookback is the moderated sessions specialist. Supports live moderated remote sessions with strong observer features. Better for deep qualitative research than quick-turnaround testing.

Hotjar is the accessible entry point for behavioral analytics. Heatmaps, session recordings, and basic feedback at a price that makes it viable for small teams. Not a research platform, but a signal source.

FullStory is enterprise behavioral analytics. Product analytics with full session replay, DX Data, and integrations. Answers different questions than any of the moderated/unmoderated tools.

The remote usability testing market has matured significantly. Where teams once chose between "hire a research firm" and "do nothing," there's now a rich ecosystem of tools covering everything from automated participant recruitment to AI-assisted session analysis. The challenge isn't finding a tool — it's knowing which category of tool to use for which question.

This comparison covers the five tools that appear most frequently in product and UX team stacks: UserTesting, Maze, Lookback, Hotjar, and FullStory. Each occupies a distinct position in the research stack.

How to Think About These Tools

Before comparing individual tools, it helps to understand the two distinct categories:

Moderated and unmoderated research platforms (UserTesting, Maze, Lookback) are designed for deliberate research — you have a question, you design tasks, you recruit participants, you collect data. The output is qualitative findings, task completion rates, and behavioral observations.

Behavioral analytics platforms (Hotjar, FullStory) observe real users doing real things with your live product, without tasks or facilitation. The output is behavioral patterns — where users click, where they drop off, what they do before or after a key action.

The research platforms help you understand why users behave as they do. The analytics platforms show you how users behave at scale. Most mature teams use at least one of each.

UserTesting

UserTesting is the most established remote usability testing platform, built around a large managed participant panel and a study design/delivery interface. You create a test, set screener criteria, and UserTesting recruits participants from its network — results typically arrive within hours.

What It Does Well

Participant supply. UserTesting's panel includes millions of participants globally, with demographic and behavioral segmentation. If you need 15 participants who are e-commerce shoppers in a specific income bracket within 4 hours, UserTesting can usually deliver.

Video and audio recording. Every session is recorded with screen capture and audio. The platform provides automatic transcription and AI-generated highlights. You can clip and share session moments directly.

Test templates. UserTesting provides pre-built templates for common research scenarios — first-click testing, comparative testing, tree testing — which reduces setup time for teams new to research.

Variety of test types. Supports task-based tests, prototype tests, card sorts, surveys, and concept tests within one platform.

Limitations

Cost. UserTesting is expensive. Business plans run several hundred dollars per response; enterprise contracts are typically five to six figures annually. It's not a tool for teams without research budget.

Panel quality variability. Participants on managed panels are accustomed to testing. They may not behave like your actual users, and heavy participants ("professional testers") can produce atypical responses. Screener quality is critical.

Limited moderation support. UserTesting's live conversation feature exists but is not the platform's strength. For live moderated sessions, other tools are better.

When to Use UserTesting

  • You need fast results with no time to recruit your own participants
  • You're testing consumer products where broad demographic reach matters
  • You need a one-platform solution for unmoderated testing and don't want to manage multiple tools

Pricing (approximate): Business plans start at ~$30,000/year. Starter plans for individuals are cheaper but limit panel access.

Maze

Maze is purpose-built for unmoderated design testing — primarily for Figma prototypes, but also for live URLs. It's built around the concept of missions (tasks with specific success criteria) and provides quantitative metrics like task completion rate, time on task, and misclick rate.

What It Does Well

Figma integration. Maze's integration with Figma is the tightest in the market. If your design process is Figma-based, you can go from a prototype to a running test in minutes. This makes it the default tool for many design teams.

Quantitative metrics. Where UserTesting is qualitative-first, Maze is quantitative-first. Task completion rates, click maps, and path analysis give you numbers, not just observations.
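To make these metrics concrete, here is a minimal sketch of how completion rate, time on task, and misclick rate could be computed from raw attempt records. The `Attempt` record shape and the `summarize` helper are assumptions made for illustration; they are not Maze's export format or API, and Maze's own definitions may differ in detail.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Attempt:
    """One participant's attempt at one task (hypothetical record shape)."""
    completed: bool   # participant reached the success screen
    seconds: float    # time spent on the task
    clicks: int       # total clicks during the attempt
    misclicks: int    # clicks that missed every interactive hotspot


def summarize(attempts: list[Attempt]) -> dict[str, float]:
    """Return completion rate, median time on task, and misclick rate."""
    if not attempts:
        raise ValueError("need at least one attempt")
    total_clicks = sum(a.clicks for a in attempts)
    # Time on task is reported for successful attempts only, so that
    # abandoned attempts do not distort the figure.
    completed_times = [a.seconds for a in attempts if a.completed]
    return {
        "completion_rate": sum(a.completed for a in attempts) / len(attempts),
        "median_seconds": median(completed_times) if completed_times else float("nan"),
        "misclick_rate": sum(a.misclicks for a in attempts) / total_clicks if total_clicks else 0.0,
    }


attempts = [
    Attempt(completed=True, seconds=20.0, clicks=5, misclicks=1),
    Attempt(completed=True, seconds=30.0, clicks=4, misclicks=0),
    Attempt(completed=False, seconds=60.0, clicks=8, misclicks=4),
    Attempt(completed=True, seconds=40.0, clicks=3, misclicks=0),
]
summary = summarize(attempts)
print(summary)
```

The median is used for time on task because a single very slow participant can skew a mean badly in the small samples typical of usability tests.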

Bring-your-own participants. Maze generates shareable test links that you can send to your own participants — useful when you have access to customers and don't want to pay for panel participants.

Price. Maze's pricing is significantly lower than UserTesting, with a free tier and paid plans starting around $99/month, making it accessible to smaller teams.

Limitations

Limited qualitative depth. Without video and audio recording (limited in the free tier, available at higher tiers), you get behavioral metrics but not the qualitative richness of session recordings.

Prototype-centric. While Maze supports live URL testing, its strongest features are designed for prototype testing. For live product testing, it's less differentiated from alternatives.

No moderation. Maze is unmoderated only. If you need live facilitated sessions, you need a different tool.

When to Use Maze

  • You're a design-led team doing pre-development validation on Figma prototypes
  • You need quantitative task completion metrics, with qualitative observations as a secondary signal
  • You have your own participants and need a way to deliver tests and collect data
  • Budget is a constraint

Pricing: Free tier available; paid plans from ~$99/month per seat.

Lookback

Lookback is a research platform specialized for moderated sessions — live, facilitated usability studies conducted via video call, with strong support for observer access and note-taking. It also supports unmoderated self-interview sessions.

What It Does Well

Live moderation. Lookback's core strength is hosting live research sessions. Participants join via a link, share their screen, and the facilitator conducts the session with full video, audio, and screen capture. Observers can watch in real time (silently) and take notes without the participant seeing them.

Observer room. The observer feature is one of Lookback's most valued differentiators. Stakeholders who can't always schedule time to sit in on research can observe asynchronously, reducing the time required to share findings.

Self-interview / diary studies. Lookback's self-interview mode lets participants record themselves responding to prompts on their own schedule — useful for diary studies, longitudinal research, and research with geographically distributed participants.

Session annotation. During live sessions, observers can timestamp moments of interest. This dramatically speeds up analysis — rather than watching full recordings, you navigate to the annotated moments.

Limitations

Logistics overhead. Moderated research is inherently slower and more expensive per insight than unmoderated. Lookback makes moderation easier but doesn't change its fundamental cost structure.

No managed panel. Lookback does not provide a participant panel. You bring your own participants. This makes it the wrong choice if you need fast results from a pool you don't have.

Complexity. For teams new to research, Lookback's feature set can feel overwhelming. It's a professional tool built for researchers, not a quick-setup testing solution.

When to Use Lookback

  • You're running formal moderated usability research with your own recruited participants
  • You need stakeholder access to sessions without bringing them into the session itself
  • You're running diary studies or longitudinal research
  • Your research ops are mature enough to handle participant recruitment independently

Pricing: Plans from ~$25/month per seat; team plans with more sessions are higher.

Hotjar

Hotjar is a behavioral analytics tool that collects heatmaps, session recordings, and user feedback on live products. It's not a usability testing platform in the traditional sense — it doesn't support tasks, participants, or facilitated sessions. Instead, it observes real users doing real things.

What It Does Well

Heatmaps. Hotjar's heatmaps (click, move, and scroll) show aggregate patterns across all visitors to a page. Where do users click? How far do they scroll? Where does attention concentrate? These are answerable with heatmap data.

Session recordings. Hotjar records individual user sessions, showing exactly what a user did on a page. Playback is filterable by behavior (e.g., "sessions where the user rage-clicked").

Feedback widgets. Hotjar's on-page survey and feedback widgets let you ask users questions in context — "what stopped you from completing your order?" on the abandoned cart page, for example.

Affordability. Hotjar's pricing starts low (free tier for small sites) and scales gradually, which puts it within reach of teams that can't afford enterprise research platforms.

Limitations

Not a research platform. Hotjar doesn't help you run studies. There are no tasks, no participants, no facilitation. It observes; it doesn't investigate.

Quantitative patterns, not qualitative insights. Hotjar tells you that 40% of users don't scroll below the fold. It doesn't tell you why. For the "why," you need a research tool.

Sampling limitations. By default, Hotjar samples sessions rather than recording all of them. At high traffic, this is fine; at low traffic, small sample sizes reduce reliability.
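The effect of sample size on reliability can be illustrated with a confidence interval. The sketch below uses the Wilson score interval for a proportion, a standard statistical technique that is not part of Hotjar. The session counts are made-up numbers chosen to match the "40% don't scroll" example, and the `wilson_interval` helper is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# The same observed rate (40%) at two hypothetical sample sizes.
low_small, high_small = wilson_interval(successes=14, n=35)
low_large, high_large = wilson_interval(successes=1400, n=3500)

print(f"n=35:   {low_small:.2f} to {high_small:.2f}")
print(f"n=3500: {low_large:.2f} to {high_large:.2f}")
```

With only a few dozen sampled sessions the plausible range spans tens of percentage points, so a heatmap or scroll-depth figure from a low-traffic page is better treated as a prompt for further research than as a measurement.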

When to Use Hotjar

  • You want passive behavioral signal on your live product without recruiting participants
  • You need heatmaps to understand where attention and clicks concentrate
  • You want to identify high-drop-off pages that warrant deeper research investigation
  • Budget is very tight — Hotjar's free tier provides usable data

Pricing: Free for up to 35 daily sessions; paid plans from ~$32/month.

FullStory

FullStory is an enterprise product analytics platform with comprehensive session capture, DX Data (quantified behavioral metrics), and integrations with product and customer success tooling. It competes more directly with Heap and Amplitude than with UserTesting or Maze.

What It Does Well

Complete session capture. FullStory captures every session (within your subscription limits) and provides DX Autocapture — all interactions are recorded without instrumentation. You don't need to pre-decide what to track.

DX Data. FullStory quantifies qualitative behaviors — frustration signals (rage clicks, dead clicks, error clicks), engagement, form abandonment — and makes them searchable and aggregatable. This turns session replay from a qualitative tool into a quantitative one.
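As an illustration of how a frustration signal can be turned into something searchable, the sketch below flags a session as containing a rage click when several clicks land close together in a short time. The thresholds, the `Click` record, and the `has_rage_click` function are assumptions for illustration only; FullStory's actual detection logic is not described here and may work differently.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Click:
    """A single click event (hypothetical record shape)."""
    t_ms: int   # timestamp in milliseconds since session start
    x: float    # page coordinates in pixels
    y: float


def has_rage_click(
    clicks: list[Click],
    min_clicks: int = 3,        # assumed threshold, at least 2
    window_ms: int = 1000,      # assumed time window
    radius_px: float = 30.0,    # assumed spatial tolerance
) -> bool:
    """True if min_clicks clicks fall within window_ms of a starting click
    and within radius_px of that starting click's position."""
    ordered = sorted(clicks, key=lambda c: c.t_ms)
    for i, first in enumerate(ordered):
        burst = 1
        for later in ordered[i + 1:]:
            if later.t_ms - first.t_ms > window_ms:
                break
            if math.hypot(later.x - first.x, later.y - first.y) <= radius_px:
                burst += 1
                if burst >= min_clicks:
                    return True
    return False


frustrated = [Click(0, 100, 100), Click(200, 102, 101), Click(400, 99, 98)]
calm = [Click(0, 100, 100), Click(2000, 100, 100), Click(4000, 100, 100)]
spread = [Click(0, 0, 0), Click(100, 500, 0), Click(200, 0, 500)]

print(has_rage_click(frustrated), has_rage_click(calm), has_rage_click(spread))
```

Once a signal like this is a boolean per session, it can be counted, filtered, and trended, which is the shift from qualitative replay to quantitative data that the paragraph above describes.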

Integrations. FullStory integrates with CRM, customer success, and support platforms. When a customer reports an issue, a support agent can watch the session replay that preceded the ticket.

Search. FullStory's session search is powerful — find sessions where a specific element was clicked, where a specific error occurred, or where a specific user (by ID) had trouble.

Limitations

Enterprise price. FullStory is not cheap. Pricing is custom and typically runs five to six figures annually for meaningful session volumes.

Not a research tool. Like Hotjar, FullStory observes rather than investigates. It tells you what happened, not why.

Implementation complexity. Getting full value from FullStory requires integration work and organizational processes for actually reviewing and acting on session data.

When to Use FullStory

  • You're an enterprise product team that needs comprehensive session data and integrations
  • You want to investigate support tickets or customer complaints using session replay
  • You need quantified behavioral metrics at scale (not just anecdotal observations)

Pricing: Custom enterprise pricing; not suitable for small teams on limited budgets.

Side-by-Side Summary

| Feature | UserTesting | Maze | Lookback | Hotjar | FullStory |
|---|---|---|---|---|---|
| Category | Unmoderated | Unmoderated | Moderated | Analytics | Analytics |
| Participant panel | Yes (large) | Optional | No | N/A | N/A |
| Live sessions | Limited | No | Yes | No | No |
| Video/audio | Yes | Limited | Yes | Session replay | Session replay |
| Quantitative metrics | Some | Yes | Some | Yes | Yes |
| Figma integration | No | Yes | No | No | No |
| Best for | Fast broad research | Design validation | Deep qualitative | Passive observation | Enterprise analytics |
| Entry price | ~$30k/year | Free/$99/mo | ~$25/mo | Free | Custom |

Combining Tools for a Full Research Stack

Most mature product teams end up with two or three tools from this list serving different purposes:

A common stack: Maze for rapid prototype validation before development, Lookback for deep moderated sessions with customers, and Hotjar (or FullStory at scale) for ongoing behavioral analytics on the live product.

This combination covers the full research cycle: validate designs early, investigate specific problems deeply, and monitor behavioral patterns continuously.

Pairing Research Tools With Functional QA

Research tools answer behavioral questions. For functional correctness — whether your product works — you need QA coverage. HelpMeTest automates functional testing on live products, so regressions get caught before users encounter them. When automated QA handles the "does it work?" question, your research stack can focus entirely on the "can users use it?" question — the right division of labor for teams that care about both dimensions of quality.

Conclusion

Choosing a remote usability testing tool is really about choosing which questions to answer and with what evidence. Moderated sessions (Lookback) give depth. Unmoderated task tests (Maze, UserTesting) give speed and some scale. Behavioral analytics (Hotjar, FullStory) give breadth and continuity.

The teams that get the most value from these tools aren't the ones who pick the most sophisticated platform — they're the ones who know what question they're trying to answer before they open a dashboard.
