WASHINGTON
—
Large
Language
Models
haven’t
achieved
human-like
consciousness
and
transformed
or
shattered
society
—
at
least
not
yet
—
as
prominent
figures
like
Elon
Musk
suggested
early
in
the
hype
cycle.
But
they
also
haven’t
been
crippled
to
the
point
of
uselessness
by
their
tendency
to
“hallucinate”
false
answers.
Instead,
generative
AI
is
emerging
as
a
useful
tool
for
a
wide
but
hardly
unlimited
range
of
purposes,
from
summarizing
reams
of
regulations
to
drafting
procurement
memoranda
and
supply
plans.
So,
two
years
after
the
public
unveiling
of
ChatGPT and
16
months
after
the
Department
of
Defense
launched
Task
Force
Lima
to
figure
out
the
perils
and
potential
of
generative
AI,
the
Pentagon’s
Chief
Digital
&
AI
Office
(CDAO)
effectively
declared
the
new
technology
was
adequately
understood
and
sufficiently
safeguarded
to
deploy.
On
Dec.
11,
the
CDAO
officially
wrapped
up
the
exploratory
task
force
a
few
months
ahead
of
schedule,
institutionalized
its
findings,
and
created
a
standing
AI
Rapid
Capabilities
Cell
(AIRCC)
with
$100
million
in
seed
funding
to
accelerate
GenAI
adoption
across
the
DoD.
[This
article
is
one
of many
in
a
series
in
which
Breaking
Defense
reporters
look
back
on
the
most
significant
(and
entertaining)
news
stories
of
2024
and
look
forward
to
what
2025
may
hold.]
The
AIRCC’s
forthcoming
pilot
projects
are
hardly
the
first
Pentagon
deployments
of
GenAI.
The
Air
Force
gave
its
personnel
access
to
a
chatbot
called
NIPRGPT
in
June,
for
example,
while
the
Army
deployed
a
GenAI
system
by
Ask
Sage
that
could
even
be
used
to
draft
formal
acquisition
documents.
But
these
two
cases
also
show
the
kinds
of
“guardrails”
the
Pentagon
believes
are
necessary
to
safely
and
responsibly
use
generative
AI.
RELATED:
In
AI
we
trust:
how
DoD’s
Task
Force
Lima
can
safeguard
generative
AI
for
warfighters
To
start
with,
neither
AI
is
on
the
open
internet:
They
both
run
only
on
closed
Defense
Department
networks
—
the
Army
cloud
for
Ask
Sage,
the
DoD-wide
NIPRnet
for
NIPRGPT.
That
sequestration
helps
prevent
leakage
of
users’
inputs,
such
as
detailed
prompts
which
might
reveal
sensitive
information.
Commercial
chatbots,
by
contrast,
often
suck
up
everything
their
users
tell
them
to
feed
their
insatiable
appetite
for
training
data,
and
it’s
possible
to
prompt
them
in
such
a
way
that
they
regurgitate,
verbatim,
the
original
information
they’ve
been
fed
—
something
the
military
definitely
doesn’t
want
to
happen.
Another increasingly common safeguard is to run the user's input through multiple Large Language Models and use them to double-check each other.
Ask
Sage,
for
instance,
has
over
150
different
models
under
the
hood.
That
way,
while
any
individual
AI
may
still
hallucinate
random
absurdities,
it’s
unlikely
that
two
completely
different
models
from
different
makers
will
generate
the
same
mistakes.
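In code, that cross-check can be sketched in a few lines. The snippet below is a minimal illustration of the idea, not Ask Sage's actual implementation: the backend functions are hypothetical stand-ins for different vendors' models, and real systems compare answers semantically rather than by exact string match.

```python
# Minimal sketch of cross-checking one prompt across several independent
# LLMs. The backends dict and its lambda "models" are hypothetical
# placeholders, not Ask Sage's actual API.
from collections import Counter
from typing import Callable

def cross_check(prompt: str,
                backends: dict[str, Callable[[str], str]],
                quorum: int = 2) -> tuple[str | None, dict[str, str]]:
    """Ask every backend the same question and require agreement.

    Returns the answer endorsed by at least `quorum` models, or None if
    no quorum is reached (a signal to escalate to a human reviewer).
    """
    answers = {name: ask(prompt).strip().lower() for name, ask in backends.items()}
    counts = Counter(answers.values())
    best, votes = counts.most_common(1)[0]
    return (best if votes >= quorum else None), answers

if __name__ == "__main__":
    # Stubs standing in for models from different makers, plus one outlier.
    backends = {
        "vendor_a_model": lambda p: "Paris",
        "vendor_b_model": lambda p: "paris",
        "vendor_c_model": lambda p: "Lyon",  # the lone hallucination
    }
    agreed, raw = cross_check("What is the capital of France?", backends)
    print(agreed)  # "paris": two independent models agree, outvoting the outlier
```

The exact-match vote here is the simplest possible comparison; production systems typically use embedding similarity or a separate judge model to decide whether two answers actually agree.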
Finally,
in
2024
it
became
a
best
practice
in
both
DoD
and
the
private
sector
to
put
generative
AI
on
a
diet,
feeding
it
only
carefully
selected
and
trustworthy
data,
often
using
a
process
called
Retrieval
Augmented
Generation
(RAG).
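As a rough sketch of how RAG enforces that diet: rather than answering from whatever the model absorbed in training, the system first retrieves passages from a vetted document store and instructs the model to answer only from those passages. Everything below (the document IDs, the naive keyword retriever, and the prompt wording) is an illustrative assumption; real deployments use vector search over embeddings and an actual LLM for the final generation step.

```python
# Minimal sketch of Retrieval Augmented Generation (RAG) over a curated
# corpus. The document IDs, the keyword-overlap retriever, and the prompt
# wording are illustrative assumptions, not any DoD system's design.
def retrieve(query: str, corpus: dict[str, str], k: int = 2) -> list[str]:
    """Rank vetted documents by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc_id: len(q_words & set(corpus[doc_id].lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query: str, corpus: dict[str, str], k: int = 2) -> str:
    """Ground the model: answer only from retrieved passages, with citations."""
    context = "\n".join(f"[{d}] {corpus[d]}" for d in retrieve(query, corpus, k))
    return (
        "Answer using only the sources below, and cite every source you use.\n"
        f"Sources:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

if __name__ == "__main__":
    corpus = {  # stand-in for a vetted store of official documents
        "DFARS-204.73": "Contractors must safeguard covered defense information.",
        "FAR-15.304": "Evaluation factors must represent key areas of importance.",
    }
    # The grounded prompt is what actually goes to the language model.
    print(build_prompt("What must contractors safeguard?", corpus))
```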
By
contrast,
many
free
public
chatbots
were
trained
on
vast
swathes
of
the
Internet,
without
any
human
fact-checking
beforehand
or
any
algorithmic
ability
to
detect
errors,
frauds,
or
outright
jokes
—
like
an
old
Reddit
post
about
putting
glue
on
pizza
that
Google’s
AI
began
regurgitating
as
a
serious
recipe
in
one
notable
example
this
year.
Some
defense
officials
said
this
year
that a
savvy
adversary
could
go
further
and
deliberately
insert
errors
into
training
data,
“poisoning”
any
AI
built
on
it
to
make
errors
they
could
exploit.
By
contrast,
the
Pentagon
prefers
AIs
which
are
trained
on
official
documents
and
other
government
datasets,
and
which
cite
specific
pages
and
paragraphs
as
supporting
evidence
for
their
answers
so
the
human
user
can
double-check
for
themselves.
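A simple audit pass can make those citations checkable by machine before a human ever reads the answer. The sketch below assumes an invented "[doc-id p.N]" citation format and page library, not an actual Pentagon convention; it flags any citation that fails to resolve to real source text, so a reviewer knows where to look first.

```python
# Illustrative audit that a model's citations resolve to real source text.
# The "[doc-id p.N]" citation format and the page library are invented for
# this sketch; they are not an actual Pentagon convention.
import re

CITATION = re.compile(r"\[(?P<doc>[\w.-]+)\s+p\.(?P<page>\d+)\]")

def audit_citations(answer: str, library: dict[str, dict[int, str]]) -> list[str]:
    """Return every citation in the answer that points at a nonexistent page."""
    unresolved = []
    for match in CITATION.finditer(answer):
        doc, page = match.group("doc"), int(match.group("page"))
        if page not in library.get(doc, {}):
            unresolved.append(match.group(0))
    return unresolved

if __name__ == "__main__":
    library = {"FAR-15": {12: "Source selection procedures text..."}}
    answer = ("Proposals are ranked on stated factors [FAR-15 p.12] "
              "and total cost [FAR-15 p.99].")
    # Prints ['[FAR-15 p.99]'], the one citation that fails to resolve.
    print(audit_citations(answer, library))
```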
None
of
these
safeguards
is
surefire,
and
it’s
still
possible
for
generative
AI
to
go
wrong.
But
at
least
the
guardrails
are
now
strong
enough
that
the
Pentagon
feels
safe
to
drive
ahead
into
2025.