Benchmarks And Outcomes – ‘Moneyball’ For GenAI (Part I) – Above the Law

It’s October. The seasons are changing. The air is growing crisper. And people in the United States are beginning to take more interest in Major League Baseball as the World Series fast approaches. In celebration of the season, we invite legal professionals to revisit “Moneyball,” the 2011 sports drama starring Brad Pitt and directed by Bennett Miller.

Pitt plays Billy Beane, who, it should be noted upfront, is neither a lawyer nor an AI expert. He is, however, the general manager of the 2002 Oakland A’s, a lifelong student of the game who struggled to consistently produce a winning team in a small baseball market. Beane didn’t have the budget to compete for players the way teams in larger markets such as New York or Boston could. The A’s could develop players, but they couldn’t retain them when they became stars. They did, however, have access to the vital statistics.

Baseball is a sport with a century of data behind it, and benchmarks like a player’s batting average are known by even casual fans. What Billy and the A’s did was use analytics and different key performance indicators to win. A player’s batting average is a great metric, but it doesn’t account for other ways of getting on base, such as drawing a walk, so on-base percentage is the better measure. Getting on base leads to more runs scored. And more runs scored means winning more games.
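For readers who want to see the difference in numbers, here is a minimal sketch of the two metrics. It uses the standard definitions: batting average is hits divided by at-bats, and on-base percentage is hits plus walks plus hit-by-pitches, divided by at-bats plus walks plus hit-by-pitches plus sacrifice flies. The two hitters and their season totals are made up for illustration and do not describe any real player.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season totals for one hypothetical hitter."""
    at_bats: int
    hits: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int

    def batting_average(self) -> float:
        # Hits per at-bat; walks never show up in this number.
        return self.hits / self.at_bats

    def on_base_percentage(self) -> float:
        # Times on base divided by the plate appearances that count toward OBP.
        reached = self.hits + self.walks + self.hit_by_pitch
        chances = (self.at_bats + self.walks
                   + self.hit_by_pitch + self.sacrifice_flies)
        return reached / chances


# Illustrative numbers only (assumed, not real players).
free_swinger = BattingLine(at_bats=600, hits=180, walks=20,
                           hit_by_pitch=0, sacrifice_flies=0)
patient_hitter = BattingLine(at_bats=500, hits=135, walks=100,
                             hit_by_pitch=0, sacrifice_flies=0)

for name, line in [("free swinger", free_swinger),
                   ("patient hitter", patient_hitter)]:
    print(f"{name}: AVG {line.batting_average():.3f}, "
          f"OBP {line.on_base_percentage():.3f}")
# free swinger: AVG 0.300, OBP 0.323
# patient hitter: AVG 0.270, OBP 0.392
```

The hitter with the lower batting average reaches base far more often. Judged on the familiar benchmark alone, he looks like the weaker player; judged on the outcome that matters, he is the better one.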

When losing a star player like Jason Giambi to the New York Yankees or Johnny Damon to the Boston Red Sox, conventional wisdom would say the team would have to replace two stars. But Beane recognized the A’s needed to replace the production of the players they lost. In aggregate, they needed to get on base as much as the prior year. Using analytics, the A’s would go on to set an American League record by winning 20 straight games and make the playoffs. But the way they did it was the bigger story.

So what can “Moneyball” teach us about benchmarking AI in legal?

The Stanford study on benchmarking GenAI solutions this past summer moved the conversation forward regarding the usefulness and impact of GenAI solutions. The study was not without controversy, which also helped raise awareness of an important topic: How do we measure the results of GenAI in the legal industry?

The Stanford study tested leading research products on their ability to create answers to questions related to caselaw research. A correct answer was one that accurately reflected the current state of the law. An answer that did not reflect the current state of the law was considered a hallucination. The result? One in six queries hallucinated.

The definition of hallucination in the study is great for benchmarking. But does a hallucination as defined in the study always equal a bad outcome? What if the answer moved your research in the right direction and then you were able to formulate a Boolean search that answered your question? Another run crossing home plate.

And what about associates using traditional research solutions? Has anyone benchmarked their legal research skills to see how often their conclusions do not reflect the current state of the law?

The key points are:

  • Benchmarks are important, and the right benchmarks for your goals are more important than what is easy to measure.
  • Benchmarks on new approaches need to be read in the context of how effective current approaches are.
  • Outcomes are more important than benchmarks.

Outcomes are always interesting. The goals of two organizations can differ. And what counts as winning or a positive outcome at one level may be different at another level of an organization.

An entertaining television advertisement that viewers recall is considered a winner in the advertising world. But what if viewers can’t remember the name of the advertiser? What if there is no discernible uptick in sales activity as a result of the ad campaign? Recall of an ad can be an example of a vanity metric: something that is perhaps easy to measure but doesn’t support decisions that a business or law firm should make. The same pitfalls can apply to measuring the efficacy of GenAI solutions. Is what we are measuring aligned with outcomes for the firm?

To be sure, goals and outcomes can change over time. Billy Beane came up with a winning strategy to confront the realities of being a general manager in a small market. Circumstances have changed: On September 26, 2024, the Oakland A’s played their last game in Oakland as they prepared for an eventual move to Las Vegas, a much bigger market with its own unique challenges.

Next month, I’ll explore different use cases for legal GenAI and relate the performance of tools to positive outcomes. Said another way, I’ll explore how to identify what counts as getting on base, scoring runs, and winning games with legal GenAI.




Ken Crutchfield is Vice President and General Manager of Legal Markets at Wolters Kluwer Legal & Regulatory U.S., a leading provider of information, business intelligence, regulatory and legal workflow solutions. Ken has more than three decades of experience as a leader in information and software solutions across industries. He can be reached at [email protected].