llm
gpu_energy(model_active_parameter_count, output_token_count, gpu_energy_alpha, gpu_energy_beta)
Compute energy consumption of a single GPU.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_active_parameter_count | float | Number of active parameters of the model. | required |
output_token_count | float | Number of generated tokens. | required |
gpu_energy_alpha | float | Alpha parameter of the GPU linear power consumption profile. | required |
gpu_energy_beta | float | Beta parameter of the GPU linear power consumption profile. | required |

Returns:
Type | Description |
---|---|
float | The energy consumption of a single GPU. |

Source code in ecologits/impacts/llm.py
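The parameter descriptions for `gpu_energy` mention a "linear power consumption profile". One plausible reading, given here as a minimal sketch and not as the library's actual implementation, is that the energy per generated token is a linear function of the active parameter count. The name `sketch_gpu_energy` and the numeric values are illustrative assumptions.

```python
def sketch_gpu_energy(
    model_active_parameter_count: float,
    output_token_count: float,
    gpu_energy_alpha: float,
    gpu_energy_beta: float,
) -> float:
    """Assumed linear profile: energy per token = alpha * parameters + beta."""
    energy_per_token = gpu_energy_alpha * model_active_parameter_count + gpu_energy_beta
    return output_token_count * energy_per_token


# Illustrative values only (not the library's defaults).
energy = sketch_gpu_energy(
    model_active_parameter_count=7.0,
    output_token_count=100.0,
    gpu_energy_alpha=2.0,
    gpu_energy_beta=1.0,
)
print(energy)
```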
generation_latency(model_active_parameter_count, output_token_count, gpu_latency_alpha, gpu_latency_beta, request_latency)
Compute the token generation latency in seconds.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_active_parameter_count | float | Number of active parameters of the model. | required |
output_token_count | float | Number of generated tokens. | required |
gpu_latency_alpha | float | Alpha parameter of the GPU linear latency profile. | required |
gpu_latency_beta | float | Beta parameter of the GPU linear latency profile. | required |
request_latency | float | Measured request latency (upper bound) in seconds. | required |

Returns:
Type | Description |
---|---|
float | The token generation latency in seconds. |

Source code in ecologits/impacts/llm.py
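Since `request_latency` is described as an upper bound, `generation_latency` can be read as a linear latency estimate capped by the measured latency. The sketch below assumes that reading; `sketch_generation_latency` is a hypothetical name and the formula is an assumption, not the confirmed implementation.

```python
def sketch_generation_latency(
    model_active_parameter_count: float,
    output_token_count: float,
    gpu_latency_alpha: float,
    gpu_latency_beta: float,
    request_latency: float,
) -> float:
    """Assumed linear profile, capped by the measured request latency."""
    latency_per_token = gpu_latency_alpha * model_active_parameter_count + gpu_latency_beta
    estimated = output_token_count * latency_per_token
    # The measured latency acts as an upper bound on the estimate.
    return min(estimated, request_latency)


# Illustrative values only: the estimate is 4 * (0.5 * 2 + 0.25) = 5 seconds.
capped = sketch_generation_latency(2.0, 4.0, 0.5, 0.25, request_latency=3.0)
uncapped = sketch_generation_latency(2.0, 4.0, 0.5, 0.25, request_latency=10.0)
print(capped, uncapped)
```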
model_required_memory(model_total_parameter_count, model_quantization_bits)
Compute the required memory to load the model on GPU.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_total_parameter_count | float | Total number of parameters of the model. | required |
model_quantization_bits | int | Number of bits used to represent the model weights. | required |

Returns:
Type | Description |
---|---|
float | The amount of required GPU memory to load the model. |

Source code in ecologits/impacts/llm.py
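As a rough illustration of how the memory footprint relates to the two inputs of `model_required_memory`, the sketch below multiplies the parameter count by the number of bytes per weight. The `memory_overhead` factor is a hypothetical parameter added for illustration, and the output unit simply follows the unit of the parameter count (for example, billions of parameters give gigabytes). This is an assumption, not the library's formula.

```python
def sketch_model_required_memory(
    model_total_parameter_count: float,
    model_quantization_bits: int,
    memory_overhead: float = 1.0,  # hypothetical factor for runtime overhead
) -> float:
    """Weights footprint: parameters * (bits / 8) bytes, times an optional overhead."""
    bytes_per_weight = model_quantization_bits / 8.0
    return model_total_parameter_count * bytes_per_weight * memory_overhead


# 7 (billion) parameters at 16 bits need 14 (giga)bytes for the weights alone.
memory_16_bits = sketch_model_required_memory(7.0, 16)
memory_4_bits = sketch_model_required_memory(70.0, 4)
print(memory_16_bits, memory_4_bits)
```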
gpu_required_count(model_required_memory, gpu_memory)
Compute the number of GPUs required to store the model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_required_memory | float | Required memory to load the model on GPU. | required |
gpu_memory | float | Amount of memory available on a single GPU. | required |

Returns:
Type | Description |
---|---|
int | The number of required GPUs to load the model. |

Source code in ecologits/impacts/llm.py
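Because the return type of `gpu_required_count` is an integer, a natural reading is a ceiling division of the required memory by the memory of one GPU. The following is a minimal sketch under that assumption, with the hypothetical name `sketch_gpu_required_count`.

```python
import math


def sketch_gpu_required_count(model_required_memory: float, gpu_memory: float) -> int:
    """Smallest number of GPUs whose combined memory fits the model (assumed)."""
    return math.ceil(model_required_memory / gpu_memory)


# Illustrative values: 80 units of memory per GPU.
print(sketch_gpu_required_count(14.0, 80.0))
print(sketch_gpu_required_count(161.0, 80.0))
```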
server_energy(generation_latency, server_power, server_gpu_count, gpu_required_count)
Compute the energy consumption of the server.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generation_latency | float | Token generation latency in seconds. | required |
server_power | float | Power consumption of the server. | required |
server_gpu_count | int | Number of available GPUs in the server. | required |
gpu_required_count | int | Number of required GPUs to load the model. | required |

Returns:
Type | Description |
---|---|
float | The energy consumption of the server (GPUs are not included). |

Source code in ecologits/impacts/llm.py
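The inputs of `server_energy` suggest that the server's energy is attributed in proportion to the share of its GPUs that the model occupies. The sketch below assumes that allocation rule, and also assumes power in kW and energy in kWh (hence the division by 3600); neither the formula nor the units are confirmed by this page.

```python
def sketch_server_energy(
    generation_latency: float,
    server_power: float,
    server_gpu_count: int,
    gpu_required_count: int,
) -> float:
    """Server energy (GPUs excluded), scaled by the share of GPUs in use (assumed)."""
    hours = generation_latency / 3600.0  # seconds -> hours
    gpu_share = gpu_required_count / server_gpu_count
    return hours * server_power * gpu_share


# Illustrative values: one hour on a 1 kW server, using 2 of its 8 GPUs.
server_kwh = sketch_server_energy(3600.0, 1.0, server_gpu_count=8, gpu_required_count=2)
print(server_kwh)
```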
request_energy(datacenter_pue, server_energy, gpu_required_count, gpu_energy)
Compute the energy consumption of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
datacenter_pue | float | PUE of the datacenter. | required |
server_energy | float | Energy consumption of the server. | required |
gpu_required_count | int | Number of required GPUs to load the model. | required |
gpu_energy | float | Energy consumption of a single GPU. | required |

Returns:
Type | Description |
---|---|
float | The energy consumption of the request. |

Source code in ecologits/impacts/llm.py
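A common way to combine the inputs of `request_energy` is to add the server energy to the energy of all required GPUs, then scale by the PUE (Power Usage Effectiveness) to account for datacenter overhead. The sketch below assumes this combination; it is not taken from the library's source.

```python
def sketch_request_energy(
    datacenter_pue: float,
    server_energy: float,
    gpu_required_count: int,
    gpu_energy: float,
) -> float:
    """Assumed: PUE * (server energy + energy of all required GPUs)."""
    it_energy = server_energy + gpu_required_count * gpu_energy
    return datacenter_pue * it_energy


# Illustrative values only.
total = sketch_request_energy(1.5, server_energy=0.25, gpu_required_count=2, gpu_energy=0.5)
print(total)
```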
request_usage_gwp(request_energy, if_electricity_mix_gwp)
Compute the Global Warming Potential (GWP) usage impact of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
request_energy | float | Energy consumption of the request. | required |
if_electricity_mix_gwp | float | GWP impact factor of electricity consumption. | required |

Returns:
Type | Description |
---|---|
float | The GWP usage impact of the request. |

Source code in ecologits/impacts/llm.py
request_usage_adpe(request_energy, if_electricity_mix_adpe)
Compute the Abiotic Depletion Potential for Elements (ADPe) usage impact of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
request_energy | float | Energy consumption of the request. | required |
if_electricity_mix_adpe | float | ADPe impact factor of electricity consumption. | required |

Returns:
Type | Description |
---|---|
float | The ADPe usage impact of the request. |

Source code in ecologits/impacts/llm.py
request_usage_pe(request_energy, if_electricity_mix_pe)
Compute the Primary Energy (PE) usage impact of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
request_energy | float | Energy consumption of the request. | required |
if_electricity_mix_pe | float | PE impact factor of electricity consumption. | required |

Returns:
Type | Description |
---|---|
float | The PE usage impact of the request. |

Source code in ecologits/impacts/llm.py
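The three usage-impact functions above (`request_usage_gwp`, `request_usage_adpe`, `request_usage_pe`) each take the request energy and an impact factor of the electricity mix. Assuming the impact factor is expressed per unit of energy, each reduces to a single multiplication. The helper `sketch_request_usage_impact` is hypothetical and the values are illustrative.

```python
def sketch_request_usage_impact(request_energy: float, impact_factor: float) -> float:
    """Assumed: usage impact = energy consumed * impact per unit of energy."""
    return request_energy * impact_factor


# Illustrative impact factors (not real electricity mix data).
request_energy = 2.0
usage = {
    "gwp": sketch_request_usage_impact(request_energy, 0.5),
    "adpe": sketch_request_usage_impact(request_energy, 0.25),
    "pe": sketch_request_usage_impact(request_energy, 8.0),
}
print(usage)
```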
server_gpu_embodied_gwp(server_embodied_gwp, server_gpu_count, gpu_embodied_gwp, gpu_required_count)
Compute the Global Warming Potential (GWP) embodied impact of the server and the GPUs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
server_embodied_gwp | float | GWP embodied impact of the server. | required |
server_gpu_count | float | Number of available GPUs in the server. | required |
gpu_embodied_gwp | float | GWP embodied impact of a single GPU. | required |
gpu_required_count | int | Number of required GPUs to load the model. | required |

Returns:
Type | Description |
---|---|
float | The GWP embodied impact of the server and the GPUs. |

Source code in ecologits/impacts/llm.py
server_gpu_embodied_adpe(server_embodied_adpe, server_gpu_count, gpu_embodied_adpe, gpu_required_count)
Compute the Abiotic Depletion Potential for Elements (ADPe) embodied impact of the server and the GPUs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
server_embodied_adpe | float | ADPe embodied impact of the server. | required |
server_gpu_count | float | Number of available GPUs in the server. | required |
gpu_embodied_adpe | float | ADPe embodied impact of a single GPU. | required |
gpu_required_count | int | Number of required GPUs to load the model. | required |

Returns:
Type | Description |
---|---|
float | The ADPe embodied impact of the server and the GPUs. |

Source code in ecologits/impacts/llm.py
server_gpu_embodied_pe(server_embodied_pe, server_gpu_count, gpu_embodied_pe, gpu_required_count)
Compute the Primary Energy (PE) embodied impact of the server and the GPUs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
server_embodied_pe | float | PE embodied impact of the server. | required |
server_gpu_count | float | Number of available GPUs in the server. | required |
gpu_embodied_pe | float | PE embodied impact of a single GPU. | required |
gpu_required_count | int | Number of required GPUs to load the model. | required |

Returns:
Type | Description |
---|---|
float | The PE embodied impact of the server and the GPUs. |

Source code in ecologits/impacts/llm.py
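The three `server_gpu_embodied_*` functions above share the same input pattern for each impact criterion. One plausible allocation, sketched here as an assumption, attributes the server's embodied impact in proportion to the share of its GPUs in use and adds the embodied impact of each required GPU. The helper name `sketch_server_gpu_embodied` is hypothetical.

```python
def sketch_server_gpu_embodied(
    server_embodied: float,
    server_gpu_count: float,
    gpu_embodied: float,
    gpu_required_count: int,
) -> float:
    """Assumed: share of the server's embodied impact + impact of the GPUs used."""
    server_share = (gpu_required_count / server_gpu_count) * server_embodied
    gpus = gpu_required_count * gpu_embodied
    return server_share + gpus


# Illustrative values: 2 of 8 GPUs used, so a quarter of the server is attributed.
embodied = sketch_server_gpu_embodied(800.0, 8, gpu_embodied=100.0, gpu_required_count=2)
print(embodied)
```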
request_embodied_gwp(server_gpu_embodied_gwp, server_lifetime, generation_latency)
Compute the Global Warming Potential (GWP) embodied impact of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
server_gpu_embodied_gwp | float | GWP embodied impact of the server and the GPUs. | required |
server_lifetime | float | Lifetime duration of the server. | required |
generation_latency | float | Token generation latency in seconds. | required |

Returns:
Type | Description |
---|---|
float | The GWP embodied impact of the request. |

Source code in ecologits/impacts/llm.py
request_embodied_adpe(server_gpu_embodied_adpe, server_lifetime, generation_latency)
Compute the Abiotic Depletion Potential for Elements (ADPe) embodied impact of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
server_gpu_embodied_adpe | float | ADPe embodied impact of the server and the GPUs. | required |
server_lifetime | float | Lifetime duration of the server. | required |
generation_latency | float | Token generation latency in seconds. | required |

Returns:
Type | Description |
---|---|
float | The ADPe embodied impact of the request. |

Source code in ecologits/impacts/llm.py
request_embodied_pe(server_gpu_embodied_pe, server_lifetime, generation_latency)
Compute the Primary Energy (PE) embodied impact of the request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
server_gpu_embodied_pe | float | PE embodied impact of the server and the GPUs. | required |
server_lifetime | float | Lifetime duration of the server. | required |
generation_latency | float | Token generation latency in seconds. | required |

Returns:
Type | Description |
---|---|
float | The PE embodied impact of the request. |

Source code in ecologits/impacts/llm.py
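The three `request_embodied_*` functions above take the hardware's embodied impact, the server lifetime, and the generation latency. A natural interpretation, assumed here, is time-based amortization: the request is charged the fraction of the hardware lifetime that it occupies. The sketch assumes `server_lifetime` is in the same unit as `generation_latency` (seconds); the unit is not stated on this page.

```python
def sketch_request_embodied(
    server_gpu_embodied: float,
    server_lifetime: float,
    generation_latency: float,
) -> float:
    """Assumed: embodied impact amortized over the fraction of lifetime used."""
    lifetime_fraction = generation_latency / server_lifetime
    return lifetime_fraction * server_gpu_embodied


# Illustrative values: the request uses 2 of 8 lifetime units, i.e. a quarter.
request_embodied = sketch_request_embodied(400.0, server_lifetime=8.0, generation_latency=2.0)
print(request_embodied)
```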
compute_llm_impacts_dag(model_active_parameter_count, model_total_parameter_count, output_token_count, request_latency, model_quantization_bits=MODEL_QUANTIZATION_BITS, gpu_energy_alpha=GPU_ENERGY_ALPHA, gpu_energy_beta=GPU_ENERGY_BETA, gpu_latency_alpha=GPU_LATENCY_ALPHA, gpu_latency_beta=GPU_LATENCY_BETA, gpu_memory=GPU_MEMORY, gpu_embodied_gwp=GPU_EMBODIED_IMPACT_GWP, gpu_embodied_adpe=GPU_EMBODIED_IMPACT_ADPE, gpu_embodied_pe=GPU_EMBODIED_IMPACT_PE, server_gpu_count=SERVER_GPUS, server_power=SERVER_POWER, server_embodied_gwp=SERVER_EMBODIED_IMPACT_GWP, server_embodied_adpe=SERVER_EMBODIED_IMPACT_ADPE, server_embodied_pe=SERVER_EMBODIED_IMPACT_PE, server_lifetime=HARDWARE_LIFESPAN, datacenter_pue=DATACENTER_PUE, if_electricity_mix_gwp=IF_ELECTRICITY_MIX_GWP, if_electricity_mix_adpe=IF_ELECTRICITY_MIX_ADPE, if_electricity_mix_pe=IF_ELECTRICITY_MIX_PE)
Compute the impacts DAG (directed acyclic graph) of an LLM generation request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_active_parameter_count | float | Number of active parameters of the model. | required |
model_total_parameter_count | float | Total number of parameters of the model. | required |
output_token_count | float | Number of generated tokens. | required |
request_latency | float | Measured request latency in seconds. | required |
model_quantization_bits | Optional[int] | Number of bits used to represent the model weights. | MODEL_QUANTIZATION_BITS |
gpu_energy_alpha | Optional[float] | Alpha parameter of the GPU linear power consumption profile. | GPU_ENERGY_ALPHA |
gpu_energy_beta | Optional[float] | Beta parameter of the GPU linear power consumption profile. | GPU_ENERGY_BETA |
gpu_latency_alpha | Optional[float] | Alpha parameter of the GPU linear latency profile. | GPU_LATENCY_ALPHA |
gpu_latency_beta | Optional[float] | Beta parameter of the GPU linear latency profile. | GPU_LATENCY_BETA |
gpu_memory | Optional[float] | Amount of memory available on a single GPU. | GPU_MEMORY |
gpu_embodied_gwp | Optional[float] | GWP embodied impact of a single GPU. | GPU_EMBODIED_IMPACT_GWP |
gpu_embodied_adpe | Optional[float] | ADPe embodied impact of a single GPU. | GPU_EMBODIED_IMPACT_ADPE |
gpu_embodied_pe | Optional[float] | PE embodied impact of a single GPU. | GPU_EMBODIED_IMPACT_PE |
server_gpu_count | Optional[int] | Number of available GPUs in the server. | SERVER_GPUS |
server_power | Optional[float] | Power consumption of the server. | SERVER_POWER |
server_embodied_gwp | Optional[float] | GWP embodied impact of the server. | SERVER_EMBODIED_IMPACT_GWP |
server_embodied_adpe | Optional[float] | ADPe embodied impact of the server. | SERVER_EMBODIED_IMPACT_ADPE |
server_embodied_pe | Optional[float] | PE embodied impact of the server. | SERVER_EMBODIED_IMPACT_PE |
server_lifetime | Optional[float] | Lifetime duration of the server. | HARDWARE_LIFESPAN |
datacenter_pue | Optional[float] | PUE of the datacenter. | DATACENTER_PUE |
if_electricity_mix_gwp | Optional[float] | GWP impact factor of electricity consumption. | IF_ELECTRICITY_MIX_GWP |
if_electricity_mix_adpe | Optional[float] | ADPe impact factor of electricity consumption. | IF_ELECTRICITY_MIX_ADPE |
if_electricity_mix_pe | Optional[float] | PE impact factor of electricity consumption. | IF_ELECTRICITY_MIX_PE |

Returns:
Type | Description |
---|---|
dict[str, float] | The impacts DAG with all intermediate states. |

Source code in ecologits/impacts/llm.py
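Throughout this module, parameter names match the names of other functions (for example, `gpu_required_count` takes a `model_required_memory` argument), which is what makes a graph-style evaluation with named intermediate states possible. The sketch below shows one way such a resolver could work, using `inspect.signature` to wire outputs to inputs by name. `resolve_dag` and the two task functions are hypothetical stand-ins with assumed formulas, not the library's implementation.

```python
import inspect
import math
from typing import Any, Callable, Dict


def resolve_dag(tasks: Dict[str, Callable[..., Any]], **inputs: Any) -> Dict[str, Any]:
    """Evaluate each task as soon as all of its named arguments are available."""
    results: Dict[str, Any] = dict(inputs)
    pending = dict(tasks)
    while pending:
        progressed = False
        for name, func in list(pending.items()):
            arg_names = list(inspect.signature(func).parameters)
            if all(arg in results for arg in arg_names):
                results[name] = func(**{arg: results[arg] for arg in arg_names})
                del pending[name]
                progressed = True
        if not progressed:  # missing input or circular dependency
            raise ValueError(f"Unresolvable tasks: {sorted(pending)}")
    return results


# Hypothetical tasks with assumed formulas; the second consumes the first by name.
def model_required_memory(model_total_parameter_count, model_quantization_bits):
    return model_total_parameter_count * model_quantization_bits / 8


def gpu_required_count(model_required_memory, gpu_memory):
    return math.ceil(model_required_memory / gpu_memory)


states = resolve_dag(
    {"gpu_required_count": gpu_required_count, "model_required_memory": model_required_memory},
    model_total_parameter_count=70.0,
    model_quantization_bits=16,
    gpu_memory=80.0,
)
print(states)
```

In this sketch the returned dictionary holds the inputs alongside every intermediate result, so each step of the computation can be inspected afterwards.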
compute_llm_impacts(model_active_parameter_count, model_total_parameter_count, output_token_count, request_latency=None, **kwargs)
Compute the impacts of an LLM generation request.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_active_parameter_count | ValueOrRange | Number of active parameters of the model. | required |
model_total_parameter_count | ValueOrRange | Total number of parameters of the model. | required |
output_token_count | float | Number of generated tokens. | required |
request_latency | Optional[float] | Measured request latency in seconds. | None |
**kwargs | Any | Any other optional parameter. | {} |

Returns:
Type | Description |
---|---|
Impacts | The impacts of an LLM generation request. |

Source code in ecologits/impacts/llm.py
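The `ValueOrRange` type of the parameter counts suggests that `compute_llm_impacts` accepts either a single value or a range. One way to handle both, sketched here as an assumption, is to evaluate the computation at each bound of the range. The `Range` dataclass and the `evaluate` helper are hypothetical stand-ins; this approach is only valid when the computation increases monotonically with its input.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Range:
    """Hypothetical stand-in for a min/max range of values."""
    min: float
    max: float


ValueOrRange = Union[float, Range]


def evaluate(func: Callable[[float], float], value: ValueOrRange) -> ValueOrRange:
    """Apply func to a single value, or to both bounds of a range."""
    if isinstance(value, Range):
        # Assumes func is monotonically increasing, so bounds map to bounds.
        return Range(min=func(value.min), max=func(value.max))
    return func(value)


def double(parameters: float) -> float:
    """Toy computation standing in for an impact calculation."""
    return 2.0 * parameters


single = evaluate(double, 3.0)
ranged = evaluate(double, Range(min=1.0, max=4.0))
print(single, ranged)
```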